Compositions and methods relating to breast specific genes and proteins

ABSTRACT

The present invention relates to newly identified nucleic acids and polypeptides present in normal and neoplastic breast cells, including fragments, variants and derivatives of the nucleic acids and polypeptides. The present invention also relates to antibodies to the polypeptides of the invention, as well as agonists and antagonists of the polypeptides of the invention. The invention also relates to compositions comprising the nucleic acids, polypeptides, antibodies, variants, derivatives, agonists and antagonists of the invention and methods for the use of these compositions. These uses include identifying, diagnosing, monitoring, staging, imaging and treating breast cancer and non-cancerous disease states in breast tissue, identifying breast tissue, monitoring and identifying and/or designing agonists and antagonists of polypeptides of the invention. The uses also include gene therapy, production of transgenic animals and cells, and production of engineered breast tissue for treatment and research.

[0001] This application claims the benefit of priority from U.S. Provisional Application Serial No. 60/249,992 filed Nov. 20, 2000, which is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to newly identified nucleic acid molecules and polypeptides present in normal and neoplastic breast cells, including fragments, variants and derivatives of the nucleic acids and polypeptides. The present invention also relates to antibodies to the polypeptides of the invention, as well as agonists and antagonists of the polypeptides of the invention. The invention also relates to compositions comprising the nucleic acids, polypeptides, antibodies, variants, derivatives, agonists and antagonists of the invention and methods for the use of these compositions. These uses include identifying, diagnosing, monitoring, staging, imaging and treating breast cancer and non-cancerous disease states in breast tissue, identifying breast tissue and monitoring and identifying and/or designing agonists and antagonists of polypeptides of the invention. The uses also include gene therapy, production of transgenic animals and cells, and production of engineered breast tissue for treatment and research.

BACKGROUND OF THE INVENTION

[0003] Excluding skin cancer, breast cancer, also called mammary tumor, is the most common cancer among women, accounting for a third of the cancers diagnosed in the United States. One in nine women will develop breast cancer in her lifetime and about 192,000 new cases of breast cancer are diagnosed annually with about 42,000 deaths. Bevers, Primary Prevention of Breast Cancer, in BREAST CANCER, 20-54 (Kelly K Hunt et al., ed., 2001); Kochanek et al., 49 Nat'l. Vital Statistics Reports 1, 14 (2001).

[0004] In the treatment of breast cancer, there is considerable emphasis on detection and risk assessment because early and accurate staging of breast cancer has a significant impact on survival. For example, breast cancer detected at an early stage (stage T0, discussed below) has a five-year survival rate of 92%. Conversely, if the cancer is not detected until a late stage (i.e., stage T4), the five-year survival rate is reduced to 13%. AJCC Cancer Staging Handbook pp. 164-65 (Irvin D. Fleming et al. eds., 5th ed. 1998). Some detection techniques, such as mammography and biopsy, involve increased discomfort, expense, and/or radiation, and are prescribed only to patients with an increased risk of breast cancer.

[0005] Current methods for predicting or detecting breast cancer risk are not optimal. One method for predicting the relative risk of breast cancer is by examining a patient's risk factors and pursuing aggressive diagnostic and treatment regimens for high risk patients. A patient's risk of breast cancer has been positively associated with increasing age, nulliparity, family history of breast cancer, personal history of breast cancer, early menarche, late menopause, late age of first full term pregnancy, prior proliferative breast disease, irradiation of the breast at an early age and a personal history of malignancy. Lifestyle factors such as fat consumption, alcohol consumption, education, and socioeconomic status have also been associated with an increased incidence of breast cancer although a direct cause and effect relationship has not been established. While these risk factors are statistically significant, their weak association with breast cancer limits their usefulness. Most women who develop breast cancer have none of the risk factors listed above, other than the risk that comes with growing older. NIH Publication No. 00-1556 (2000).

[0006] Current screening methods for detecting cancer, such as breast self exam, ultrasound, and mammography have drawbacks that reduce their effectiveness or prevent their widespread adoption. Breast self exams, while useful, are unreliable for the detection of breast cancer in the initial stages where the tumor is small and difficult to detect by palpation. Ultrasound measurements require skilled operators at an increased expense. Mammography, while sensitive, is subject to overdiagnosis in the detection of lesions that have questionable malignant potential. There is also the fear of the radiation used in mammography because prior chest radiation is a factor associated with an increased incidence of breast cancer.

[0007] At this time, there are no adequate methods of breast cancer prevention. The current methods of breast cancer prevention involve prophylactic mastectomy (mastectomy performed before cancer diagnosis) and chemoprevention (chemotherapy before cancer diagnosis) which are drastic measures that limit their adoption even among women with increased risk of breast cancer. Bevers, supra.

[0008] A number of genetic markers have been associated with breast cancer. Examples of these markers include carcinoembryonic antigen (CEA) (Mughal et al., 249 JAMA 1881 (1983)), MUC-1 (Frische and Liu, 22 J. Clin. Ligand 320 (2000)), HER-2/neu (Haris et al., 15 Proc. Am. Soc. Clin. Oncology A96 (1996)), uPA, PAI-1, LPA, LPC, RAK and BRCA (Esteva and Fritsche, Serum and Tissue Markers for Breast Cancer, in BREAST CANCER, 286-308 (2001)). These markers have problems with limited sensitivity, low correlation, and false negatives which limit their use for initial diagnosis. For example, while the BRCA1 gene mutation is useful as an indicator of an increased risk for breast cancer, it has limited use in cancer diagnosis because only 6.2% of breast cancers are BRCA1 positive. Malone et al., 279 JAMA 922 (1998). See also, Newman et al., 279 JAMA 915 (1998) (correlation of only 3.3%).

[0009] Breast cancers are diagnosed into the appropriate stage categories recognizing that different treatments are more effective for different stages of cancer. Stage TX indicates that primary tumor cannot be assessed (i.e., tumor was removed or breast tissue was removed). Stage T0 is characterized by abnormalities such as hyperplasia but with no evidence of primary tumor. Stage Tis is characterized by carcinoma in situ, intraductal carcinoma, lobular carcinoma in situ, or Paget's disease of the nipple with no tumor. Stage T1 is characterized as having a tumor of 2 cm or less in the greatest dimension. Within stage T1, Tmic indicates microinvasion of 0.1 cm or less, T1a indicates a tumor of between 0.1 to 0.5 cm, T1b indicates a tumor of between 0.5 to 1 cm, and T1c indicates tumors of between 1 cm to 2 cm. Stage T2 is characterized by tumors from 2 cm to 5 cm in the greatest dimension. Tumors greater than 5 cm in size are classified as stage T3. Stage T4 indicates a tumor of any size with direct extension to the chest wall or skin. Within stage T4, T4a indicates extension of the tumor to the chest wall, T4b indicates edema or ulceration of the skin of the breast or satellite skin nodules confined to the same breast, T4c indicates a combination of T4a and T4b, and T4d indicates inflammatory carcinoma. AJCC Cancer Staging Handbook pp. 159-70 (Irvin D. Fleming et al. eds., 5th ed. 1998). In addition to standard staging, breast tumors may be classified according to their estrogen receptor and progesterone receptor protein status. Fisher et al., 7 Breast Cancer Research and Treatment 147 (1986). Additional pathological status, such as HER2/neu status may also be useful. Thor et al., 90 J. Nat'l. Cancer Inst. 1346 (1998); Paik et al., 90 J. Nat'l. Cancer Inst. 1361 (1998); Hutchins et al., 17 Proc. Am. Soc. Clin. Oncology A2 (1998); and Simpson et al., 18 J. Clin. Oncology 2059 (2000).

[0010] In addition to the staging of the primary tumor, breast cancer metastases to regional lymph nodes may be staged. Stage NX indicates that the lymph nodes cannot be assessed (e.g., previously removed). Stage N0 indicates no regional lymph node metastasis. Stage N1 indicates metastasis to movable ipsilateral axillary lymph nodes. Stage N2 indicates metastasis to ipsilateral axillary lymph nodes fixed to one another or to other structures. Stage N3 indicates metastasis to ipsilateral internal mammary lymph nodes. Id.

[0011] Stage determination has potential prognostic value and provides criteria for designing optimal therapy. Simpson et al., 18 J. Clin. Oncology 2059 (2000). Generally, pathological staging of breast cancer is preferable to clinical staging because the former gives a more accurate prognosis. However, clinical staging would be preferred if it were as accurate as pathological staging because it does not depend on an invasive procedure to obtain tissue for pathological evaluation. Staging of breast cancer would be improved by detecting new markers in cells, tissues, or bodily fluids which could differentiate between different stages of invasion. Progress in this field will allow more rapid and reliable methods for treating breast cancer patients.

[0012] Treatment of breast cancer is generally decided after an accurate staging of the primary tumor. Primary treatment options include breast conserving therapy (lumpectomy, breast irradiation, and surgical staging of the axilla), and modified radical mastectomy. Additional treatments include chemotherapy, regional irradiation, and, in extreme cases, terminating estrogen production by ovarian ablation.

[0013] Until recently, the customary treatment for all breast cancer was mastectomy. Fonseca et al., 127 Annals of Internal Medicine 1013 (1997). However, recent data indicate that less radical procedures may be equally effective, in terms of survival, for early stage breast cancer. Fisher et al., 16 J. of Clinical Oncology 441 (1998). The treatment options for a patient with early stage breast cancer (i.e., stage Tis) may be breast-sparing surgery followed by localized radiation therapy at the breast. Alternatively, mastectomy optionally coupled with radiation or breast reconstruction may be employed. These treatment methods are equally effective in the early stages of breast cancer.

[0014] Patients with stage I and stage II breast cancer require surgery with chemotherapy and/or hormonal therapy. Surgery is of limited use in stage III and stage IV patients. Thus, these patients are better candidates for chemotherapy and radiation therapy with surgery limited to biopsy to permit initial staging or subsequent restaging because surgery is rarely curative at this stage of the disease. AJCC Cancer Staging Handbook 84, pp. 164-65 (Irvin D. Fleming et al. eds., 5th ed. 1998).

[0015] In an effort to provide more treatment options to patients, efforts are underway to define an earlier stage of breast cancer with low recurrence which could be treated with lumpectomy without postoperative radiation treatment. While a number of attempts have been made to classify early stage breast cancer, no consensus recommendation on postoperative radiation treatment has been obtained from these studies. Page et al., 75 Cancer 1219 (1995); Fisher et al., 75 Cancer 1223 (1995); Silverstein et al., 77 Cancer 2267 (1996).

[0016] As discussed above, each of the methods for diagnosing and staging breast cancer is limited by the technology employed. Accordingly, there is need for sensitive molecular and cellular markers for the detection of breast cancer. There is a need for molecular markers for the accurate staging, including clinical and pathological staging, of breast cancers to optimize treatment methods. Finally, there is a need for sensitive molecular and cellular markers to monitor the progress of cancer treatments, including markers that can detect recurrence of breast cancers following remission.

[0017] Other objects, features, advantages and aspects of the present invention will become apparent to those of skill in the art from the following description. It should be understood, however, that the following description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only. Various changes and modifications within the spirit and scope of the disclosed invention will become readily apparent to those skilled in the art from reading the following description and from reading the other parts of the present disclosure.

SUMMARY OF THE INVENTION

[0018] The present invention solves these and other needs in the art by providing nucleic acid molecules and polypeptides as well as antibodies, agonists and antagonists thereto that may be used to identify, diagnose, monitor, stage, image and treat breast cancer and non-cancerous disease states in breast; identify and monitor breast tissue; and identify and design agonists and antagonists of polypeptides of the invention. The invention also provides gene therapy, methods for producing transgenic animals and cells, and methods for producing engineered breast tissue for treatment and research.

[0019] Accordingly, one object of the invention is to provide nucleic acid molecules that are specific to breast cells and/or breast tissue. These breast specific nucleic acids (BSNAs) may be a naturally-occurring cDNA, genomic DNA, RNA, or a fragment of one of these nucleic acids, or may be a non-naturally-occurring nucleic acid molecule. If the BSNA is genomic DNA, then the BSNA is a breast specific gene (BSG). In a preferred embodiment, the nucleic acid molecule encodes a polypeptide that is specific to breast. In a more preferred embodiment, the nucleic acid molecule encodes a polypeptide that comprises an amino acid sequence of SEQ ID NO: 116 through 218. In another highly preferred embodiment, the nucleic acid molecule comprises a nucleic acid sequence of SEQ ID NO: 1 through 115. By nucleic acid molecule, it is also meant to be inclusive of sequences that selectively hybridize or exhibit substantial sequence similarity to a nucleic acid molecule encoding a BSP, or that selectively hybridize or exhibit substantial sequence similarity to a BSNA, as well as allelic variants of a nucleic acid molecule encoding a BSP, and allelic variants of a BSNA. Nucleic acid molecules comprising a part of a nucleic acid sequence that encodes a BSP or that comprises a part of a nucleic acid sequence of a BSNA are also provided.

[0020] A related object of the present invention is to provide a nucleic acid molecule comprising one or more expression control sequences controlling the transcription and/or translation of all or a part of a BSNA. In a preferred embodiment, the nucleic acid molecule comprises one or more expression control sequences controlling the transcription and/or translation of a nucleic acid molecule that encodes all or a fragment of a BSP.

[0021] Another object of the invention is to provide vectors and/or host cells comprising a nucleic acid molecule of the instant invention. In a preferred embodiment, the nucleic acid molecule encodes all or a fragment of a BSP. In another preferred embodiment, the nucleic acid molecule comprises all or a part of a BSNA.

[0022] Another object of the invention is to provide methods for using the vectors and host cells comprising a nucleic acid molecule of the instant invention to recombinantly produce polypeptides of the invention.

[0023] Another object of the invention is to provide a polypeptide encoded by a nucleic acid molecule of the invention. In a preferred embodiment, the polypeptide is a BSP. The polypeptide may comprise either a fragment or a full-length protein as well as a mutant protein (mutein), fusion protein, homologous protein or a polypeptide encoded by an allelic variant of a BSP.

[0024] Another object of the invention is to provide an antibody that specifically binds to a polypeptide of the instant invention.

[0025] Another object of the invention is to provide agonists and antagonists of the nucleic acid molecules and polypeptides of the instant invention.

[0026] Another object of the invention is to provide methods for using the nucleic acid molecules to detect or amplify nucleic acid molecules that have similar or identical nucleic acid sequences compared to the nucleic acid molecules described herein. In a preferred embodiment, the invention provides methods of using the nucleic acid molecules of the invention for identifying, diagnosing, monitoring, staging, imaging and treating breast cancer and non-cancerous disease states in breast. In another preferred embodiment, the invention provides methods of using the nucleic acid molecules of the invention for identifying and/or monitoring breast tissue. The nucleic acid molecules of the instant invention may also be used in gene therapy, for producing transgenic animals and cells, and for producing engineered breast tissue for treatment and research.

[0027] The polypeptides and/or antibodies of the instant invention may also be used to identify, diagnose, monitor, stage, image and treat breast cancer and non-cancerous disease states in breast. The invention provides methods of using the polypeptides of the invention to identify and/or monitor breast tissue, and to produce engineered breast tissue.

[0028] The agonists and antagonists of the instant invention may be used to treat breast cancer and non-cancerous disease states in breast and to produce engineered breast tissue.

[0029] Yet another object of the invention is to provide a computer readable means of storing the nucleic acid and amino acid sequences of the invention. The records of the computer readable means can be accessed for reading and displaying of sequences for comparison, alignment and ordering of the sequences of the invention to other sequences.

DETAILED DESCRIPTION OF THE INVENTION

[0030] Definitions and General Techniques

[0031] Unless otherwise defined herein, scientific and technical terms used in connection with the present invention shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. Generally, nomenclatures used in connection with, and techniques of, cell and tissue culture, molecular biology, immunology, microbiology, genetics and protein and nucleic acid chemistry and hybridization described herein are those well-known and commonly used in the art. The methods and techniques of the present invention are generally performed according to conventional methods well-known in the art and as described in various general and more specific references that are cited and discussed throughout the present specification unless otherwise indicated. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press (1989) and Sambrook et al., Molecular Cloning: A Laboratory Manual, 3d ed., Cold Spring Harbor Press (2001); Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and Supplements to 2000); Ausubel et al., Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology, 4th Ed., Wiley & Sons (1999); Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1990); and Harlow and Lane, Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1999); each of which is incorporated herein by reference in its entirety.

[0032] Enzymatic reactions and purification techniques are performed according to manufacturer's specifications, as commonly accomplished in the art or as described herein. The nomenclatures used in connection with, and the laboratory procedures and techniques of, analytical chemistry, synthetic organic chemistry, and medicinal and pharmaceutical chemistry described herein are those well-known and commonly used in the art. Standard techniques are used for chemical syntheses, chemical analyses, pharmaceutical preparation, formulation, and delivery, and treatment of patients.

[0033] The following terms, unless otherwise indicated, shall be understood to have the following meanings:

[0034] A “nucleic acid molecule” of this invention refers to a polymeric form of nucleotides and includes both sense and antisense strands of RNA, cDNA, genomic DNA, and synthetic forms and mixed polymers of the above. A nucleotide refers to a ribonucleotide, deoxynucleotide or a modified form of either type of nucleotide. A “nucleic acid molecule” as used herein is synonymous with “nucleic acid” and “polynucleotide.” The term “nucleic acid molecule” usually refers to a molecule of at least 10 bases in length, unless otherwise specified. The term includes single- and double-stranded forms of DNA. In addition, a polynucleotide may include either or both naturally-occurring and modified nucleotides linked together by naturally-occurring and/or non-naturally occurring nucleotide linkages.

[0035] The nucleic acid molecules may be modified chemically or biochemically or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, etc.), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, etc.). The term “nucleic acid molecule” also includes any topological conformation, including single-stranded, double-stranded, partially duplexed, triplexed, hairpinned, circular and padlocked conformations. Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Such molecules are known in the art and include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of the molecule.

[0036] A “gene” is defined as a nucleic acid molecule that comprises a nucleic acid sequence that encodes a polypeptide and the expression control sequences that surround the nucleic acid sequence that encodes the polypeptide. For instance, a gene may comprise a promoter, one or more enhancers, a nucleic acid sequence that encodes a polypeptide, downstream regulatory sequences and, possibly, other nucleic acid sequences involved in regulation of the expression of an RNA. As is well-known in the art, eukaryotic genes usually contain both exons and introns. The term “exon” refers to a nucleic acid sequence found in genomic DNA that is bioinformatically predicted and/or experimentally confirmed to contribute a contiguous sequence to a mature mRNA transcript. The term “intron” refers to a nucleic acid sequence found in genomic DNA that is predicted and/or confirmed to not contribute to a mature mRNA transcript, but rather to be “spliced out” during processing of the transcript.

[0037] A nucleic acid molecule or polypeptide is “derived” from a particular species if the nucleic acid molecule or polypeptide has been isolated from the particular species, or if the nucleic acid molecule or polypeptide is homologous to a nucleic acid molecule or polypeptide isolated from a particular species.

[0038] An “isolated” or “substantially pure” nucleic acid or polynucleotide (e.g., an RNA, DNA or a mixed polymer) is one which is substantially separated from other cellular components that naturally accompany the native polynucleotide in its natural host cell, e.g., ribosomes, polymerases, or genomic sequences with which it is naturally associated. The term embraces a nucleic acid or polynucleotide that (1) has been removed from its naturally occurring environment, (2) is not associated with all or a portion of a polynucleotide in which the “isolated polynucleotide” is found in nature, (3) is operatively linked to a polynucleotide which it is not linked to in nature, (4) does not occur in nature as part of a larger sequence or (5) includes nucleotides or internucleoside bonds that are not found in nature. The term “isolated” or “substantially pure” also can be used in reference to recombinant or cloned DNA isolates, chemically synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems. The term “isolated nucleic acid molecule” includes nucleic acid molecules that are integrated into a host cell chromosome at a heterologous site, recombinant fusions of a native fragment to a heterologous sequence, recombinant vectors present as episomes or as integrated into a host cell chromosome.

[0039] A “part” of a nucleic acid molecule refers to a nucleic acid molecule that comprises a partial contiguous sequence of at least 10 bases of the reference nucleic acid molecule. Preferably, a part comprises at least 15 to 20 bases of a reference nucleic acid molecule. In theory, a nucleic acid sequence of 17 nucleotides is of sufficient length to occur at random less frequently than once in the three gigabase human genome, and thus to provide a nucleic acid probe that can uniquely identify the reference sequence in a nucleic acid mixture of genomic complexity. A preferred part is one that comprises a nucleic acid sequence that can encode at least 6 contiguous amino acid sequences (fragments of at least 18 nucleotides) because they are useful in directing the expression or synthesis of peptides that are useful in mapping the epitopes of the polypeptide encoded by the reference nucleic acid. See, e.g., Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1984); and U.S. Pat. Nos. 4,708,871 and 5,595,915, the disclosures of which are incorporated herein by reference in their entireties. A part may also comprise at least 25, 30, 35 or 40 nucleotides of a reference nucleic acid molecule, or at least 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400 or 500 nucleotides of a reference nucleic acid molecule. A part of a nucleic acid molecule may comprise no other nucleic acid sequences. Alternatively, a part of a nucleic acid may comprise other nucleic acid sequences from other nucleic acid molecules.
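The 17-nucleotide figure can be checked with a short calculation. The following is a minimal sketch only, assuming an idealized random genome in which the four bases are equally likely and independent, and a genome size of 3 × 10^9 bases; the function names and the `strands` parameter are hypothetical and are not part of the disclosure.

```python
def expected_random_occurrences(length: int,
                                genome_size: float = 3e9,
                                strands: int = 1) -> float:
    """Expected number of chance matches to one specific sequence.

    Assumes equal, independent base frequencies (an idealization),
    so any given sequence of `length` bases has probability 4**-length
    at each position. `strands=2` also counts the complementary strand.
    """
    return strands * genome_size / (4 ** length)


def min_unique_length(genome_size: float = 3e9, strands: int = 1) -> int:
    """Shortest length whose expected chance occurrence is below one."""
    length = 1
    while expected_random_occurrences(length, genome_size, strands) >= 1:
        length += 1
    return length


if __name__ == "__main__":
    for n in (15, 16, 17, 18):
        print(n, expected_random_occurrences(n, strands=2))
    print("minimum length, both strands:", min_unique_length(strands=2))
```

Under these assumptions a 17-mer is expected to occur by chance less than once even when both strands are counted, which is consistent with the statement in the text.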

[0040] The term “oligonucleotide” refers to a nucleic acid molecule generally comprising a length of 200 bases or fewer. The term often refers to single-stranded deoxyribonucleotides, but it can refer as well to single- or double-stranded ribonucleotides, RNA:DNA hybrids and double-stranded DNAs, among others. Preferably, oligonucleotides are 10 to 60 bases in length and most preferably 12, 13, 14, 15, 16, 17, 18, 19 or 20 bases in length. Other preferred oligonucleotides are 25, 30, 35, 40, 45, 50, 55 or 60 bases in length. Oligonucleotides may be single-stranded, e.g. for use as probes or primers, or may be double-stranded, e.g. for use in the construction of a mutant gene. Oligonucleotides of the invention can be either sense or antisense oligonucleotides. An oligonucleotide can be derivatized or modified as discussed above for nucleic acid molecules.

[0041] Oligonucleotides, such as single-stranded DNA probe oligonucleotides, often are synthesized by chemical methods, such as those implemented on automated oligonucleotide synthesizers. However, oligonucleotides can be made by a variety of other methods, including in vitro recombinant DNA-mediated techniques and by expression of DNAs in cells and organisms. Initially, chemically synthesized DNAs typically are obtained without a 5′ phosphate. The 5′ ends of such oligonucleotides are not substrates for phosphodiester bond formation by ligation reactions that employ DNA ligases typically used to form recombinant DNA molecules. Where ligation of such oligonucleotides is desired, a phosphate can be added by standard techniques, such as those that employ a kinase and ATP. The 3′ end of a chemically synthesized oligonucleotide generally has a free hydroxyl group and, in the presence of a ligase, such as T4 DNA ligase, readily will form a phosphodiester bond with a 5′ phosphate of another polynucleotide, such as another oligonucleotide. As is well-known, this reaction can be prevented selectively, where desired, by removing the 5′ phosphates of the other polynucleotide(s) prior to ligation.

[0042] The term “naturally-occurring nucleotide” referred to herein includes naturally-occurring deoxyribonucleotides and ribonucleotides. The term “modified nucleotides” referred to herein includes nucleotides with modified or substituted sugar groups and the like. The term “nucleotide linkages” referred to herein includes nucleotide linkages such as phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoraniladate, phosphoroamidate, and the like. See e.g., LaPlanche et al. Nucl. Acids Res. 14:9081-9093 (1986); Stein et al. Nucl. Acids Res. 16:3209-3221 (1988); Zon et al. Anti-Cancer Drug Design 6:539-568 (1991); Zon et al., in Eckstein (ed.) Oligonucleotides and Analogues: A Practical Approach, pp. 87-108, Oxford University Press (1991); U.S. Pat. No. 5,151,510; Uhlmann and Peyman Chemical Reviews 90:543 (1990), the disclosures of which are hereby incorporated by reference.

[0043] Unless specified otherwise, the left hand end of a polynucleotide sequence in sense orientation is the 5′ end and the right hand end of the sequence is the 3′ end. In addition, the left hand direction of a polynucleotide sequence in sense orientation is referred to as the 5′ direction, while the right hand direction of the polynucleotide sequence is referred to as the 3′ direction. Further, unless otherwise indicated, each nucleotide sequence is set forth herein as a sequence of deoxyribonucleotides. It is intended, however, that the given sequence be interpreted as would be appropriate to the polynucleotide composition: for example, if the isolated nucleic acid is composed of RNA, the given sequence intends ribonucleotides, with uridine substituted for thymidine.
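As an illustration of the convention just described, a sequence written as deoxyribonucleotides can be read as RNA by substituting U for T while keeping the 5′ to 3′ (left to right) order. This is a minimal sketch; the function name `dna_to_rna` is hypothetical, and the sketch assumes the input uses only the letters A, C, G and T.

```python
def dna_to_rna(sequence: str) -> str:
    """Read a 5'-to-3' DNA sequence as RNA: uridine replaces thymidine.

    The order of the bases is unchanged, so the left end remains the 5' end.
    """
    return sequence.upper().replace("T", "U")


if __name__ == "__main__":
    print(dna_to_rna("ATGCTT"))
```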

[0044] The term “allelic variant” refers to one of two or more alternative naturally-occurring forms of a gene, wherein each gene possesses a unique nucleotide sequence. In a preferred embodiment, different alleles of a given gene have similar or identical biological properties.

[0045] The term “percent sequence identity” in the context of nucleic acid sequences refers to the residues in two sequences which are the same when aligned for maximum correspondence. The length of sequence identity comparison may be over a stretch of at least about nine nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 32 nucleotides, and preferably at least about 36 or more nucleotides. There are a number of different algorithms known in the art which can be used to measure nucleotide sequence identity. For instance, polynucleotide sequences can be compared using FASTA, Gap or Bestfit, which are programs in Wisconsin Package Version 10.0, Genetics Computer Group (GCG), Madison, Wis. FASTA, which includes, e.g., the programs FASTA2 and FASTA3, provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences (Pearson, Methods Enzymol. 183: 63-98 (1990); Pearson, Methods Mol. Biol. 132: 185-219 (2000); Pearson, Methods Enzymol. 266: 227-258 (1996); Pearson, J. Mol. Biol. 276: 71-84 (1998); herein incorporated by reference). Unless otherwise specified, default parameters for a particular program or algorithm are used. For instance, percent sequence identity between nucleic acid sequences can be determined using FASTA with its default parameters (a word size of 6 and the NOPAM factor for the scoring matrix) or using Gap with its default parameters as provided in GCG Version 6.1, herein incorporated by reference.
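To make the idea of percent sequence identity concrete, the sketch below counts identical residues over two sequences that have already been aligned. It is illustrative only and does not reproduce FASTA, Gap or Bestfit: it assumes the alignment is supplied by the caller with `-` marking gaps, and it assumes the denominator is the number of aligned columns, a choice that differs between programs. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base.

    Inputs must be pre-aligned and of equal length, with '-' for gaps.
    A column containing a gap never counts as a match.
    Assumption: the denominator is the total number of aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Four of the six aligned columns match.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```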

[0046] A reference to a nucleic acid sequence encompasses its complement unless otherwise specified. Thus, a reference to a nucleic acid molecule having a particular sequence should be understood to encompass its complementary strand, with its complementary sequence. The complementary strand is also useful, e.g., for antisense therapy, hybridization probes and PCR primers.
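As a minimal sketch of what the complementary strand of a given DNA sequence looks like when both strands are written 5′ to 3′, the following hypothetical helper reverses the sequence and swaps each base for its Watson-Crick partner. The function name is illustrative and the sketch handles only the four unambiguous DNA bases.

```python
_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the complementary DNA strand, written 5' to 3'."""
    try:
        # Walk the input from its 3' end so the result reads 5' to 3'.
        return "".join(_DNA_COMPLEMENT[base] for base in reversed(seq.upper()))
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```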

[0047] In the molecular biology art, researchers use the terms “percent sequence identity”, “percent sequence similarity” and “percent sequence homology” interchangeably. In this application, these terms shall have the same meaning with respect to nucleic acid sequences only.

[0048] The term “substantial similarity” or “substantial sequence similarity,” when referring to a nucleic acid or fragment thereof, indicates that, when optimally aligned with appropriate nucleotide insertions or deletions with another nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about 50%, more preferably 60% of the nucleotide bases, usually at least about 70%, more usually at least about 80%, preferably at least about 90%, and more preferably at least about 95-98% of the nucleotide bases, as measured by any well-known algorithm of sequence identity, such as FASTA, BLAST or Gap, as discussed above.

[0049] Alternatively, substantial similarity exists when a nucleic acid or fragment thereof hybridizes to another nucleic acid, to a strand of another nucleic acid, or to the complementary strand thereof, under selective hybridization conditions. Typically, selective hybridization will occur when there is at least about 55% sequence identity, preferably at least about 65%, more preferably at least about 75%, and most preferably at least about 90% sequence identity, over a stretch of at least about 14 nucleotides, more preferably at least 17 nucleotides, even more preferably at least 20, 25, 30, 35, 40, 50, 60, 70, 80, 90 or 100 nucleotides.

[0050] Nucleic acid hybridization will be affected by such conditions as salt concentration, temperature, solvents, the base composition of the hybridizing species, length of the complementary regions, and the number of nucleotide base mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art. “Stringent hybridization conditions” and “stringent wash conditions” in the context of nucleic acid hybridization experiments depend upon a number of different physical parameters. The most important parameters include temperature of hybridization, base composition of the nucleic acids, salt concentration and length of the nucleic acid. One having ordinary skill in the art knows how to vary these parameters to achieve a particular stringency of hybridization. In general, “stringent hybridization” is performed at about 25° C. below the thermal melting point (T_m) for the specific DNA hybrid under a particular set of conditions. “Stringent washing” is performed at temperatures about 5° C. lower than the T_m for the specific DNA hybrid under a particular set of conditions. The T_m is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. See Sambrook (1989), supra, p. 9.51, hereby incorporated by reference.

[0051] The T_m for a particular DNA-DNA hybrid can be estimated by the formula:

T_m = 81.5° C. + 16.6(log₁₀[Na⁺]) + 0.41(fraction G+C) − 0.63(% formamide) − (600/l)

[0052] where l is the length of the hybrid in base pairs.

[0053] The T_m for a particular RNA-RNA hybrid can be estimated by the formula:

T_m = 79.8° C. + 18.5(log₁₀[Na⁺]) + 0.58(fraction G+C) + 11.8(fraction G+C)² − 0.35(% formamide) − (820/l).

[0054] The T_m for a particular RNA-DNA hybrid can be estimated by the formula:

T_m = 79.8° C. + 18.5(log₁₀[Na⁺]) + 0.58(fraction G+C) + 11.8(fraction G+C)² − 0.50(% formamide) − (820/l).
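For illustration, the three estimates above can be evaluated numerically as sketched below. The function and parameter names are hypothetical. One assumption is flagged loudly: the formulas as written express the G+C term as a fraction, but the DNA-DNA estimate is commonly applied with the 0.41 coefficient multiplying percent G+C, so the DNA-DNA function in this sketch takes G+C in percent, while the two RNA functions take G+C as a fraction between 0 and 1. A user should confirm which convention applies before relying on any value.

```python
import math


def tm_dna_dna(na_molar: float, gc_percent: float,
               formamide_percent: float, length_bp: int) -> float:
    """Estimated T_m (deg C) of a DNA-DNA hybrid.

    ASSUMPTION: G+C is given in percent (0-100), not as a fraction.
    """
    return (81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent
            - 0.63 * formamide_percent - 600.0 / length_bp)


def tm_rna_rna(na_molar: float, gc_fraction: float,
               formamide_percent: float, length_bp: int) -> float:
    """Estimated T_m (deg C) of an RNA-RNA hybrid; G+C as a fraction (0-1)."""
    return (79.8 + 18.5 * math.log10(na_molar) + 0.58 * gc_fraction
            + 11.8 * gc_fraction ** 2
            - 0.35 * formamide_percent - 820.0 / length_bp)


def tm_rna_dna(na_molar: float, gc_fraction: float,
               formamide_percent: float, length_bp: int) -> float:
    """Estimated T_m (deg C) of an RNA-DNA hybrid; G+C as a fraction (0-1)."""
    return (79.8 + 18.5 * math.log10(na_molar) + 0.58 * gc_fraction
            + 11.8 * gc_fraction ** 2
            - 0.50 * formamide_percent - 820.0 / length_bp)


if __name__ == "__main__":
    # Illustrative inputs only: 1 M Na+, 50% G+C, no formamide, 600 bp hybrid.
    print(round(tm_dna_dna(1.0, 50.0, 0.0, 600), 2))
```

The only difference between the RNA-RNA and RNA-DNA estimates is the formamide coefficient (0.35 versus 0.50), which the two functions keep separate so that the distinction stays visible.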

[0055] In general, the T_m decreases by 1-1.5° C. for each 1% of mismatch between two nucleic acid sequences. Thus, one having ordinary skill in the art can alter hybridization and/or washing conditions to obtain sequences that have higher or lower degrees of sequence identity to the target nucleic acid. For instance, to obtain hybridizing nucleic acids that contain up to 10% mismatch from the target nucleic acid sequence, 10-15° C. would be subtracted from the calculated T_m of a perfectly matched hybrid, and then the hybridization and washing temperatures adjusted accordingly. Probe sequences may also hybridize specifically to duplex DNA under certain conditions to form triplex or other higher order DNA complexes. The preparation of such probes and suitable hybridization conditions are well-known in the art.
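The mismatch correction described above is simple arithmetic and may be sketched as follows. The function name is hypothetical; the only inputs are the melting temperature calculated for a perfectly matched hybrid and the percent mismatch to be tolerated, and the result is the range implied by the 1-1.5° C. per 1% rule.

```python
def mismatch_adjusted_tm(tm_perfect_c: float, mismatch_percent: float) -> tuple[float, float]:
    """Return (lowest, highest) expected melting temperature in deg C.

    Applies the rule of thumb that the melting temperature drops by
    1 to 1.5 deg C for each 1% of mismatch.
    """
    if mismatch_percent < 0:
        raise ValueError("mismatch_percent must be non-negative")
    lowest = tm_perfect_c - 1.5 * mismatch_percent
    highest = tm_perfect_c - 1.0 * mismatch_percent
    return lowest, highest


if __name__ == "__main__":
    # 10% mismatch lowers a calculated 80 deg C value by 10 to 15 deg C.
    print(mismatch_adjusted_tm(80.0, 10.0))
```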

[0056] An example of stringent hybridization conditions for hybridization of complementary nucleic acid sequences having more than 100 complementary residues on a filter in a Southern or Northern blot or for screening a library is 50% formamide/6×SSC at 42° C. for at least ten hours and preferably overnight (approximately 16 hours). Another example of stringent hybridization conditions is 6×SSC at 68° C. without formamide for at least ten hours and preferably overnight. An example of moderate stringency hybridization conditions is 6×SSC at 55° C. without formamide for at least ten hours and preferably overnight. An example of low stringency hybridization conditions for hybridization of complementary nucleic acid sequences having more than 100 complementary residues on a filter in a Southern or Northern blot or for screening a library is 6×SSC at 42° C. for at least ten hours. Hybridization conditions to identify nucleic acid sequences that are similar but not identical can be identified by experimentally changing the hybridization temperature from 68° C. to 42° C. while keeping the salt concentration constant (6×SSC), or keeping the hybridization temperature and salt concentration constant (e.g. 42° C. and 6×SSC) and varying the formamide concentration from 50% to 0%. Hybridization buffers may also include blocking agents to lower background. These agents are well-known in the art. See Sambrook et al. (1989), supra, pages 8.46 and 9.46-9.58, herein incorporated by reference. See also Ausubel (1992), supra, Ausubel (1999), supra, and Sambrook (2001), supra.

[0057] Wash conditions also can be altered to change stringency conditions. An example of stringent wash conditions is a 0.2×SSC wash at 65° C. for 15 minutes (see Sambrook (1989), supra, for SSC buffer). Often the high stringency wash is preceded by a low stringency wash to remove excess probe. An exemplary medium stringency wash for duplex DNA of more than 100 base pairs is 1×SSC at 45° C. for 15 minutes. An exemplary low stringency wash for such a duplex is 4×SSC at 40° C. for 15 minutes. In general, a signal-to-noise ratio of 2× or higher than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.

[0058] As defined herein, nucleic acid molecules that do not hybridize to each other under stringent conditions are still substantially similar to one another if they encode polypeptides that are substantially identical to each other. This occurs, for example, when a nucleic acid molecule is created synthetically or recombinantly using high codon degeneracy as permitted by the redundancy of the genetic code.

[0059] Hybridization conditions for nucleic acid molecules that are shorter than 100 nucleotides in length (e.g., for oligonucleotide probes) may be calculated by the formula:

T_m = 81.5° C. + 16.6(log₁₀[Na⁺]) + 0.41(fraction G+C) − (600/N),

[0060] wherein N is chain length and the [Na⁺] is 1 M or less. See Sambrook (1989), supra, p. 11.46. For hybridization of probes shorter than 100 nucleotides, hybridization is usually performed under stringent conditions (5-10° C. below the T_m) using high concentrations (0.1-1.0 pmol/ml) of probe. Id. at p. 11.45. Determination of hybridization using mismatched probes, pools of degenerate probes or “guessmers,” as well as hybridization solutions and methods for empirically determining hybridization conditions are well-known in the art. See, e.g., Ausubel (1999), supra; Sambrook (1989), supra, pp. 11.45-11.57.

[0061] The term “digestion” or “digestion of DNA” refers to catalytic cleavage of the DNA with a restriction enzyme that acts only at certain sequences in the DNA. The various restriction enzymes referred to herein are commercially available and their reaction conditions, cofactors and other requirements for use are known and routine to the skilled artisan. For analytical purposes, typically, 1 μg of plasmid or DNA fragment is digested with about 2 units of enzyme in about 20 μl of reaction buffer. For the purpose of isolating DNA fragments for plasmid construction, typically 5 to 50 μg of DNA are digested with 20 to 250 units of enzyme in proportionately larger volumes. Appropriate buffers and substrate amounts for particular restriction enzymes are described in standard laboratory manuals, such as those referenced below, and they are specified by commercial suppliers. Incubation times of about 1 hour at 37° C. are ordinarily used, but conditions may vary in accordance with standard procedures, the supplier's instructions and the particulars of the reaction. After digestion, reactions may be analyzed, and fragments may be purified by electrophoresis through an agarose or polyacrylamide gel, using well-known methods that are routine for those skilled in the art.

[0062] The term “ligation” refers to the process of forming phosphodiester bonds between two or more polynucleotides, which most often are double-stranded DNAs. Techniques for ligation are well-known to the art and protocols for ligation are described in standard laboratory manuals and references, such as, e.g., Sambrook (1989), supra.

[0063] Genome-derived “single exon probes” are probes that comprise at least part of an exon (“reference exon”) and can hybridize detectably under high stringency conditions to transcript-derived nucleic acids that include the reference exon but do not hybridize detectably under high stringency conditions to nucleic acids that lack the reference exon. Single exon probes typically further comprise, contiguous to a first end of the exon portion, a first intronic and/or intergenic sequence that is identically contiguous to the exon in the genome, and may contain a second intronic and/or intergenic sequence that is identically contiguous to the exon in the genome. The minimum length of genome-derived single exon probes is defined by the requirement that the exonic portion be of sufficient length to hybridize under high stringency conditions to transcript-derived nucleic acids, as discussed above. The maximum length of genome-derived single exon probes is defined by the requirement that the probes contain portions of no more than one exon. The single exon probes may contain priming sequences not found in contiguity with the rest of the probe sequence in the genome, which priming sequences are useful for PCR and other amplification-based technologies.

[0064] The term “microarray” or “nucleic acid microarray” refers to a substrate-bound collection of plural nucleic acids, hybridization to each of the plurality of bound nucleic acids being separately detectable. The substrate can be solid or porous, planar or non-planar, unitary or distributed. Microarrays or nucleic acid microarrays include all the devices so called in Schena (ed.), DNA Microarrays: A Practical Approach (Practical Approach Series), Oxford University Press (1999); Nature Genet. 21(1)(suppl.):1-60 (1999); Schena (ed.), Microarray Biochip: Tools and Technology, Eaton Publishing Company/BioTechniques Books Division (2000). These microarrays include substrate-bound collections of plural nucleic acids in which the plurality of nucleic acids are disposed on a plurality of beads, rather than on a unitary planar substrate, as is described, inter alia, in Brenner et al., Proc. Natl. Acad. Sci. USA 97(4):1665-1670 (2000).

[0065] The term “mutated” when applied to nucleic acid molecules means that nucleotides in the nucleic acid sequence of the nucleic acid molecule may be inserted, deleted or changed compared to a reference nucleic acid sequence. A single alteration may be made at a locus (a point mutation) or multiple nucleotides may be inserted, deleted or changed at a single locus. In addition, one or more alterations may be made at any number of loci within a nucleic acid sequence. In a preferred embodiment, the nucleic acid molecule comprises the wild type nucleic acid sequence encoding a BSP or is a BSNA. The nucleic acid molecule may be mutated by any method known in the art including those mutagenesis techniques described infra.

[0066] The term “error-prone PCR” refers to a process for performing PCR under conditions where the copying fidelity of the DNA polymerase is low, such that a high rate of point mutations is obtained along the entire length of the PCR product. See, e.g., Leung et al., Technique 1: 11-15 (1989) and Caldwell et al., PCR Methods Applic. 2: 28-33 (1992).

[0067] The term “oligonucleotide-directed mutagenesis” refers to a process which enables the generation of site-specific mutations in any cloned DNA segment of interest. See, e.g., Reidhaar-Olson et al., Science 241: 53-57 (1988).

[0068] The term “assembly PCR” refers to a process which involves the assembly of a PCR product from a mixture of small DNA fragments. A large number of different PCR reactions occur in parallel in the same vial, with the products of one reaction priming the products of another reaction.

[0069] The term “sexual PCR mutagenesis” or “DNA shuffling” refers to a method of error-prone PCR coupled with forced homologous recombination between DNA molecules of different but highly related DNA sequence in vitro, caused by random fragmentation of the DNA molecule based on sequence similarity, followed by fixation of the crossover by primer extension in an error-prone PCR reaction. See, e.g., Stemmer, Proc. Natl. Acad. Sci. U.S.A. 91: 10747-10751 (1994). DNA shuffling can be carried out between several related genes (“Family shuffling”).

[0070] The term “in vivo mutagenesis” refers to a process of generating random mutations in any cloned DNA of interest which involves the propagation of the DNA in a strain of bacteria such as E. coli that carries mutations in one or more of the DNA repair pathways. These “mutator” strains have a higher random mutation rate than that of a wild-type parent. Propagating the DNA in a mutator strain will eventually generate random mutations within the DNA.

[0071] The term “cassette mutagenesis” refers to any process for replacing a small region of a double-stranded DNA molecule with a synthetic oligonucleotide “cassette” that differs from the native sequence. The oligonucleotide often contains completely and/or partially randomized native sequence.

[0072] The term “recursive ensemble mutagenesis” refers to an algorithm for protein engineering (protein mutagenesis) developed to produce diverse populations of phenotypically related mutants whose members differ in amino acid sequence. This method uses a feedback mechanism to control successive rounds of combinatorial cassette mutagenesis. See, e.g., Arkin et al., Proc. Natl. Acad. Sci. U.S.A. 89: 7811-7815 (1992).

[0073] The term “exponential ensemble mutagenesis” refers to a process for generating combinatorial libraries with a high percentage of unique and functional mutants, wherein small groups of residues are randomized in parallel to identify, at each altered position, amino acids which lead to functional proteins. See, e.g., Delegrave et al., Biotechnology Research 11: 1548-1552 (1993); Arnold, Current Opinion in Biotechnology 4: 450-455 (1993). Each of the references mentioned above is hereby incorporated by reference in its entirety.

[0074] “Operatively linked” expression control sequences refers to a linkage in which the expression control sequence is contiguous with the gene of interest to control the gene of interest, as well as expression control sequences that act in trans or at a distance to control the gene of interest.

[0075] The term “expression control sequence” as used herein refers to polynucleotide sequences which are necessary to affect the expression of coding sequences to which they are operatively linked. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and when desired, sequences that enhance protein secretion. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include the promoter, ribosomal binding site, and transcription termination sequence. The term “control sequences” is intended to include, at a minimum, all components whose presence is essential for expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.

[0076] The term “vector,” as used herein, is intended to refer to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a “plasmid”, which refers to a circular double-stranded DNA loop into which additional DNA segments may be ligated. Other vectors include cosmids, bacterial artificial chromosomes (BAC) and yeast artificial chromosomes (YAC). Another type of vector is a viral vector, wherein additional DNA segments may be ligated into the viral genome. Viral vectors that infect bacterial cells are referred to as bacteriophages. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication). Other vectors can be integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “recombinant expression vectors” (or simply, “expression vectors”). In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. In the present specification, “plasmid” and “vector” may be used interchangeably as the plasmid is the most commonly used form of vector. However, the invention is intended to include other forms of expression vectors that serve equivalent functions.

[0077] The term “recombinant host cell” (or simply “host cell”), as used herein, is intended to refer to a cell into which an expression vector has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term “host cell” as used herein.

[0078] As used herein, the phrase “open reading frame” and the equivalent acronym “ORF” refer to that portion of a transcript-derived nucleic acid that can be translated in its entirety into a sequence of contiguous amino acids. As so defined, an ORF has length, measured in nucleotides, exactly divisible by 3. As so defined, an ORF need not encode the entirety of a natural protein.

[0079] As used herein, the phrase “ORF-encoded peptide” refers to the predicted or actual translation of an ORF.

[0080] As used herein, the phrase “degenerate variant” of a reference nucleic acid sequence intends all nucleic acid sequences that can be directly translated, using the standard genetic code, to provide an amino acid sequence identical to that translated from the reference nucleic acid sequence.
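The definitions of an ORF and of a degenerate variant can be illustrated with a short sketch that translates a sequence with the standard genetic code and then compares two translations. The function names are hypothetical; the codon table is built from the conventional 64-letter listing of the standard code in T, C, A, G order, with "*" denoting a stop codon.

```python
from itertools import product

_BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a nucleotide sequence whose length is exactly divisible by 3."""
    seq = orf.upper().replace("U", "T")  # accept RNA input as well
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be exactly divisible by 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_degenerate_variant(reference: str, candidate: str) -> bool:
    """True if both sequences translate to the identical amino acid sequence."""
    return translate(reference) == translate(candidate)


if __name__ == "__main__":
    # GCT and GCC both encode alanine, so the second sequence is a
    # degenerate variant of the first.
    print(is_degenerate_variant("ATGGCT", "ATGGCC"))
```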

[0081] The term “polypeptide” encompasses both naturally-occurring and non-naturally-occurring proteins and polypeptides, polypeptide fragments and polypeptide mutants, derivatives and analogs. A polypeptide may be monomeric or polymeric. Further, a polypeptide may comprise a number of different modules within a single polypeptide each of which has one or more distinct activities. A preferred polypeptide in accordance with the invention comprises a BSP encoded by a nucleic acid molecule of the instant invention, as well as a fragment, mutant, analog and derivative thereof.

[0082] The term “isolated protein” or “isolated polypeptide” is a protein or polypeptide that by virtue of its origin or source of derivation (1) is not associated with naturally associated components that accompany it in its native state, (2) is free of other proteins from the same species, (3) is expressed by a cell from a different species, or (4) does not occur in nature. Thus, a polypeptide that is chemically synthesized or synthesized in a cellular system different from the cell from which it naturally originates will be “isolated” from its naturally associated components. A polypeptide or protein may also be rendered substantially free of naturally associated components by isolation, using protein purification techniques well-known in the art.

[0083] A protein or polypeptide is “substantially pure,” “substantially homogeneous” or “substantially purified” when at least about 60% to 75% of a sample exhibits a single species of polypeptide. The polypeptide or protein may be monomeric or multimeric. A substantially pure polypeptide or protein will typically comprise about 50%, 60%, 70%, 80% or 90% W/W of a protein sample, more usually about 95%, and preferably will be over 99% pure. Protein purity or homogeneity may be indicated by a number of means well-known in the art, such as polyacrylamide gel electrophoresis of a protein sample, followed by visualizing a single polypeptide band upon staining the gel with a stain well-known in the art. For certain purposes, higher resolution may be provided by using HPLC or other means well-known in the art for purification.

[0084] The term “polypeptide fragment” as used herein refers to a polypeptide of the instant invention that has an amino-terminal and/or carboxy-terminal deletion compared to a full-length polypeptide. In a preferred embodiment, the polypeptide fragment is a contiguous sequence in which the amino acid sequence of the fragment is identical to the corresponding positions in the naturally-occurring sequence. Fragments typically are at least 5, 6, 7, 8, 9 or 10 amino acids long, preferably at least 12, 14, 16 or 18 amino acids long, more preferably at least 20 amino acids long, more preferably at least 25, 30, 35, 40 or 45 amino acids, even more preferably at least 50 or 60 amino acids long, and even more preferably at least 70 amino acids long.

[0085] A “derivative” refers to polypeptides or fragments thereof that are substantially similar in primary structural sequence but which include, e.g., in vivo or in vitro chemical and biochemical modifications that are not found in the native polypeptide. Such modifications include, for example, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cystine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination. Other modifications include, e.g., labeling with radionuclides, and various enzymatic modifications, as will be readily appreciated by those skilled in the art. A variety of methods for labeling polypeptides and of substituents or labels useful for such purposes are well-known in the art, and include radioactive isotopes such as ¹²⁵I, ³²P, ³⁵S, and ³H, ligands which bind to labeled antiligands (e.g., antibodies), fluorophores, chemiluminescent agents, enzymes, and antiligands which can serve as specific binding pair members for a labeled ligand. The choice of label depends on the sensitivity required, ease of conjugation with the primer, stability requirements, and available instrumentation. Methods for labeling polypeptides are well-known in the art. See Ausubel (1992), supra; Ausubel (1999), supra, herein incorporated by reference.

[0086] The term “fusion protein” refers to polypeptides of the instant invention comprising polypeptides or fragments coupled to heterologous amino acid sequences. Fusion proteins are useful because they can be constructed to contain two or more desired functional elements from two or more different proteins. A fusion protein comprises at least 10 contiguous amino acids from a polypeptide of interest, more preferably at least 20 or 30 amino acids, even more preferably at least 40, 50 or 60 amino acids, yet more preferably at least 75, 100 or 125 amino acids. Fusion proteins can be produced recombinantly by constructing a nucleic acid sequence which encodes the polypeptide or a fragment thereof in frame with a nucleic acid sequence encoding a different protein or peptide and then expressing the fusion protein. Alternatively, a fusion protein can be produced chemically by crosslinking the polypeptide or a fragment thereof to another protein.

[0087] The term “analog” refers to both polypeptide analogs and non-peptide analogs. The term “polypeptide analog” as used herein refers to a polypeptide of the instant invention that is comprised of a segment of at least 25 amino acids that has substantial identity to a portion of an amino acid sequence but which contains non-natural amino acids or non-natural inter-residue bonds. In a preferred embodiment, the analog has the same or similar biological activity as the native polypeptide. Typically, polypeptide analogs comprise a conservative amino acid substitution (or insertion or deletion) with respect to the naturally-occurring sequence. Analogs typically are at least 20 amino acids long, preferably at least 50 amino acids long or longer, and can often be as long as a full-length naturally-occurring polypeptide.

[0088] The term “non-peptide analog” refers to a compound with properties that are analogous to those of a reference polypeptide of the instant invention. A non-peptide compound may also be termed a “peptide mimetic” or a “peptidomimetic.” Such compounds are often developed with the aid of computerized molecular modeling. Peptide mimetics that are structurally similar to useful peptides may be used to produce an equivalent effect. Generally, peptidomimetics are structurally similar to a paradigm polypeptide (i.e., a polypeptide that has a desired biochemical property or pharmacological activity), but have one or more peptide linkages optionally replaced by a linkage selected from the group consisting of: —CH₂NH—, —CH₂S—, —CH₂—CH₂—, —CH═CH— (cis and trans), —COCH₂—, —CH(OH)CH₂—, and —CH₂SO—, by methods well-known in the art. Systematic substitution of one or more amino acids of a consensus sequence with a D-amino acid of the same type (e.g., D-lysine in place of L-lysine) may also be used to generate more stable peptides. In addition, constrained peptides comprising a consensus sequence or a substantially identical consensus sequence variation may be generated by methods known in the art (Rizo et al., Ann. Rev. Biochem. 61:387-418 (1992), incorporated herein by reference). For example, one may add internal cysteine residues capable of forming intramolecular disulfide bridges which cyclize the peptide.

[0089] A “polypeptide mutant” or “mutein” refers to a polypeptide of the instant invention whose sequence contains substitutions, insertions or deletions of one or more amino acids compared to the amino acid sequence of a native or wild-type protein. A mutein may have one or more amino acid point substitutions, in which a single amino acid at a position has been changed to another amino acid, one or more insertions and/or deletions, in which one or more amino acids are inserted or deleted, respectively, in the sequence of the naturally-occurring protein, and/or truncations of the amino acid sequence at either or both the amino or carboxy termini. Further, a mutein may have the same or different biological activity as the naturally-occurring protein. For instance, a mutein may have an increased or decreased biological activity. A mutein has at least 50% sequence similarity to the wild type protein, preferred is 60% sequence similarity, more preferred is 70% sequence similarity. Even more preferred are muteins having 80%, 85% or 90% sequence similarity to the wild type protein. In an even more preferred embodiment, a mutein exhibits 95% sequence identity, even more preferably 97%, even more preferably 98% and even more preferably 99%. Sequence similarity may be measured by any common sequence analysis algorithm, such as Gap or Bestfit.

[0090] Preferred amino acid substitutions are those which: (1) reduce susceptibility to proteolysis, (2) reduce susceptibility to oxidation, (3) alter binding affinity for forming protein complexes, (4) alter binding affinity or enzymatic activity, and (5) confer or modify other physicochemical or functional properties of such analogs. For example, single or multiple amino acid substitutions (preferably conservative amino acid substitutions) may be made in the naturally-occurring sequence (preferably in the portion of the polypeptide outside the domain(s) forming intermolecular contacts). In a preferred embodiment, the amino acid substitutions are moderately conservative substitutions or conservative substitutions. In a more preferred embodiment, the amino acid substitutions are conservative substitutions. A conservative amino acid substitution should not substantially change the structural characteristics of the parent sequence (e.g., a replacement amino acid should not tend to disrupt a helix that occurs in the parent sequence, or disrupt other types of secondary structure that characterize the parent sequence). Examples of art-recognized polypeptide secondary and tertiary structures are described in Creighton (ed.), Proteins, Structures and Molecular Principles, W. H. Freeman and Company (1984); Branden et al. (ed.), Introduction to Protein Structure, Garland Publishing (1991); Thornton et al., Nature 354:105-106 (1991), each of which is incorporated herein by reference.

[0091] As used herein, the twenty conventional amino acids and their abbreviations follow conventional usage. See Golub et al. (eds.), Immunology—A Synthesis 2nd Ed., Sinauer Associates (1991), which is incorporated herein by reference. Stereoisomers (e.g., D-amino acids) of the twenty conventional amino acids, unnatural amino acids such as α,α-disubstituted amino acids, N-alkyl amino acids, and other unconventional amino acids may also be suitable components for polypeptides of the present invention. Examples of unconventional amino acids include: 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, s-N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline). In the polypeptide notation used herein, the left hand direction is the amino terminal direction and the right hand direction is the carboxy-terminal direction, in accordance with standard usage and convention.

[0092] A protein has “homology” or is “homologous” to a protein from another organism if the encoded amino acid sequence of the protein has a similar sequence to the encoded amino acid sequence of a protein of a different organism and has a similar biological activity or function. Alternatively, a protein may have homology or be homologous to another protein if the two proteins have similar amino acid sequences and have similar biological activities or functions. Although two proteins are said to be “homologous,” this does not imply that there is necessarily an evolutionary relationship between the proteins. Instead, the term “homologous” is defined to mean that the two proteins have similar amino acid sequences and similar biological activities or functions. In a preferred embodiment, a homologous protein is one that exhibits 50% sequence similarity to the wild type protein, preferred is 60% sequence similarity, more preferred is 70% sequence similarity. Even more preferred are homologous proteins that exhibit 80%, 85% or 90% sequence similarity to the wild type protein. In a yet more preferred embodiment, a homologous protein exhibits 95%, 97%, 98% or 99% sequence similarity.

[0093] When “sequence similarity” is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions. In a preferred embodiment, a polypeptide that has “sequence similarity” comprises conservative or moderately conservative amino acid substitutions. A “conservative amino acid substitution” is one in which an amino acid residue is substituted by another amino acid residue having a side chain (R group) with similar chemical properties (e.g., charge or hydrophobicity). In general, a conservative amino acid substitution will not substantially change the functional properties of a protein. In cases where two or more amino acid sequences differ from each other by conservative substitutions, the percent sequence identity or degree of similarity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well-known to those of skill in the art. See, e.g., Pearson, Methods Mol. Biol. 24: 307-31 (1994), herein incorporated by reference.

[0094] For instance, the following six groups each contain amino acids that are conservative substitutions for one another: 1) Serine (S), Threonine (T); 2) Aspartic Acid (D), Glutamic Acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine (V); and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
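
By way of illustration only, the six groups listed above can be expressed as a simple lookup. The following is a minimal sketch in Python; the names `CONSERVATIVE_GROUPS` and `is_conservative_substitution` are hypothetical and are not part of any sequence analysis package referred to herein.

```python
# Illustrative sketch: the six conservative substitution groups listed above,
# written with one-letter amino acid codes. All names here are hypothetical.
CONSERVATIVE_GROUPS = [
    set("ST"),     # 1) Serine, Threonine
    set("DE"),     # 2) Aspartic Acid, Glutamic Acid
    set("NQ"),     # 3) Asparagine, Glutamine
    set("RK"),     # 4) Arginine, Lysine
    set("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative_substitution(a: str, b: str) -> bool:
    """Return True if residues a and b differ and fall within the same group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution at all
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Under this sketch, a change from isoleucine to valine would be reported as conservative, whereas a change from serine to aspartic acid would not.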

[0095] Alternatively, a conservative replacement is any change having a positive value in the PAM250 log-likelihood matrix disclosed in Gonnet et al., Science 256: 1443-45 (1992), herein incorporated by reference. A “moderately conservative” replacement is any change having a nonnegative value in the PAM250 log-likelihood matrix.

[0096] Sequence similarity for polypeptides, which is also referred to as sequence identity, is typically measured using sequence analysis software. Protein analysis software matches similar sequences using measures of similarity assigned to various substitutions, deletions and other modifications, including conservative amino acid substitutions. For instance, GCG contains programs such as “Gap” and “Bestfit” which can be used with default parameters to determine sequence homology or sequence identity between closely related polypeptides, such as homologous polypeptides from different species of organisms or between a wild type protein and a mutein thereof. See, e.g., GCG Version 6.1. Other programs include FASTA, discussed supra.

[0097] A preferred algorithm when comparing a sequence of the invention to a database containing a large number of sequences from different organisms is the computer program BLAST, especially blastp or tblastn. See, e.g., Altschul et al., J. Mol. Biol. 215: 403-410 (1990); Altschul et al., Nucleic Acids Res. 25: 3389-402 (1997); herein incorporated by reference. Preferred parameters for blastp are:

Expectation value: 10 (default)
Filter: seg (default)
Cost to open a gap: 11 (default)
Cost to extend a gap: 1 (default)
Max. alignments: 100 (default)
Word size: 11 (default)
No. of descriptions: 100 (default)
Penalty Matrix: BLOSUM62

[0098] The length of polypeptide sequences compared for homology will generally be at least about 16 amino acid residues, usually at least about 20 residues, more usually at least about 24 residues, typically at least about 28 residues, and preferably more than about 35 residues. When searching a database containing sequences from a large number of different organisms, it is preferable to compare amino acid sequences.

[0099] Algorithms other than blastp for database searching using amino acid sequences are known in the art. For instance, polypeptide sequences can be compared using FASTA, a program in GCG Version 6.1. FASTA (e.g., FASTA2 and FASTA3) provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences (Pearson (1990), supra; Pearson (2000), supra). For example, percent sequence identity between amino acid sequences can be determined using FASTA with its default or recommended parameters (a word size of 2 and the PAM250 scoring matrix), as provided in GCG Version 6.1, herein incorporated by reference.

[0100] An “antibody” refers to an intact immunoglobulin, or to an antigen-binding portion thereof that competes with the intact antibody for specific binding to a molecular species, e.g., a polypeptide of the instant invention. Antigen-binding portions may be produced by recombinant DNA techniques or by enzymatic or chemical cleavage of intact antibodies. Antigen-binding portions include, inter alia, Fab, Fab′, F(ab′)₂, Fv, dAb, and complementarity determining region (CDR) fragments, single-chain antibodies (scFv), chimeric antibodies, diabodies and polypeptides that contain at least a portion of an immunoglobulin that is sufficient to confer specific antigen binding to the polypeptide. An Fab fragment is a monovalent fragment consisting of the VL, VH, CL and CH1 domains; an F(ab′)₂ fragment is a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; an Fd fragment consists of the VH and CH1 domains; an Fv fragment consists of the VL and VH domains of a single arm of an antibody; and a dAb fragment consists of a VH domain. See, e.g., Ward et al., Nature 341: 544-546 (1989).

[0101] By “bind specifically” and “specific binding” is here intended the ability of the antibody to bind to a first molecular species in preference to binding to other molecular species with which the antibody and first molecular species are admixed. An antibody is said specifically to “recognize” a first molecular species when it can bind specifically to that first molecular species.

[0102] A single-chain antibody (scFv) is an antibody in which a VL and VH region are paired to form a monovalent molecule via a synthetic linker that enables them to be made as a single protein chain. See, e.g., Bird et al., Science 242: 423-426 (1988); Huston et al., Proc. Natl. Acad. Sci. USA 85: 5879-5883 (1988). Diabodies are bivalent, bispecific antibodies in which VH and VL domains are expressed on a single polypeptide chain, but using a linker that is too short to allow for pairing between the two domains on the same chain, thereby forcing the domains to pair with complementary domains of another chain and creating two antigen binding sites. See, e.g., Holliger et al., Proc. Natl. Acad. Sci. USA 90: 6444-6448 (1993); Poljak et al., Structure 2: 1121-1123 (1994). One or more CDRs may be incorporated into a molecule either covalently or noncovalently to make it an immunoadhesin. An immunoadhesin may incorporate the CDR(s) as part of a larger polypeptide chain, may covalently link the CDR(s) to another polypeptide chain, or may incorporate the CDR(s) noncovalently. The CDRs permit the immunoadhesin to specifically bind to a particular antigen of interest. A chimeric antibody is an antibody that contains one or more regions from one antibody and one or more regions from one or more other antibodies.

[0103] An antibody may have one or more binding sites. If there is more than one binding site, the binding sites may be identical to one another or may be different. For instance, a naturally-occurring immunoglobulin has two identical binding sites, a single-chain antibody or Fab fragment has one binding site, while a “bispecific” or “bifunctional” antibody has two different binding sites.

[0104] An “isolated antibody” is an antibody that (1) is not associated with naturally-associated components, including other naturally-associated antibodies, that accompany it in its native state, (2) is free of other proteins from the same species, (3) is expressed by a cell from a different species, or (4) does not occur in nature. It is known that purified proteins, including purified antibodies, may be stabilized with non-naturally-associated components. The non-naturally-associated component may be a protein, such as albumin (e.g., BSA) or a chemical such as polyethylene glycol (PEG).

[0105] A “neutralizing antibody” or “an inhibitory antibody” is an antibody that inhibits the activity of a polypeptide or blocks the binding of a polypeptide to a ligand that normally binds to it. An “activating antibody” is an antibody that increases the activity of a polypeptide.

[0106] The term “epitope” includes any protein determinant capable of specifically binding to an immunoglobulin or T-cell receptor. Epitopic determinants usually consist of chemically active surface groupings of molecules such as amino acids or sugar side chains and usually have specific three-dimensional structural characteristics, as well as specific charge characteristics. An antibody is said to specifically bind an antigen when the dissociation constant is less than 1 μM, preferably less than 100 nM and most preferably less than 10 nM.
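
The dissociation constant thresholds recited above may be illustrated, without limitation, by the following short Python sketch. The function name `binding_tier` and the tier labels it returns are hypothetical conveniences introduced here; only the three numerical thresholds are taken from the text.

```python
def binding_tier(kd_molar: float) -> str:
    """Classify a dissociation constant (in mol/L) against the thresholds above.

    The labels returned are illustrative only; the thresholds (1 uM, 100 nM,
    10 nM) are those recited in the preceding paragraph.
    """
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    if kd_molar < 10e-9:    # less than 10 nM
        return "most preferred"
    if kd_molar < 100e-9:   # less than 100 nM
        return "preferred"
    if kd_molar < 1e-6:     # less than 1 uM
        return "specific"
    return "not specific"
```

For example, an antibody with a measured dissociation constant of 50 nM would fall within the "preferred" tier of this sketch.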

[0107] The term “patient” as used herein includes human and veterinary subjects.

[0108] Throughout this specification and claims, the word “comprise,” or variations such as “comprises” or “comprising,” will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.

[0109] The term “breast specific” refers to a nucleic acid molecule or polypeptide that is expressed predominantly in the breast as compared to other tissues in the body. In a preferred embodiment, a “breast specific” nucleic acid molecule or polypeptide is expressed at a level that is 5-fold higher than any other tissue in the body. In a more preferred embodiment, the “breast specific” nucleic acid molecule or polypeptide is expressed at a level that is 10-fold higher than any other tissue in the body, more preferably at least 15-fold, 20-fold, 25-fold, 50-fold or 100-fold higher than any other tissue in the body. Nucleic acid molecule levels may be measured by nucleic acid hybridization, such as Northern blot hybridization, or quantitative PCR. Polypeptide levels may be measured by any method known to accurately quantitate protein levels, such as Western blot analysis.
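
As a non-limiting illustration of the fold-expression criterion above, the following Python sketch compares the expression level in breast with the highest level observed in any other tissue. The function names, the tissue labels and the choice to treat the 5-fold figure as a minimum are assumptions made for this example; the expression values would come from a quantitative method such as those named above.

```python
from typing import Mapping


def fold_over_other_tissues(levels: Mapping[str, float], target: str = "breast") -> float:
    """Ratio of the target tissue level to the highest level in any other tissue."""
    others = [value for tissue, value in levels.items() if tissue != target]
    if not others:
        raise ValueError("at least one non-target tissue is required")
    highest_other = max(others)
    if highest_other <= 0:
        return float("inf")  # no detectable expression outside the target tissue
    return levels[target] / highest_other


def is_breast_specific(levels: Mapping[str, float], min_fold: float = 5.0) -> bool:
    """Apply a fold threshold (5-fold by default, per the preferred embodiment)."""
    return fold_over_other_tissues(levels) >= min_fold
```

With hypothetical normalized levels of 50 in breast, 5 in liver and 2 in lung, the ratio is 10, so the sketch would report the molecule as breast specific at the 5-fold and 10-fold thresholds but not at the 15-fold threshold.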

[0110] Nucleic Acid Molecules, Regulatory Sequences, Vectors, Host Cells and Recombinant Methods of Making Polypeptides

[0111] Nucleic Acid Molecules

[0112] One aspect of the invention provides isolated nucleic acid molecules that are specific to the breast or to breast cells or tissue or that are derived from such nucleic acid molecules. These isolated breast specific nucleic acids (BSNAs) may comprise a cDNA, a genomic DNA, RNA, or a fragment of one of these nucleic acids, or may be a non-naturally-occurring nucleic acid molecule. In a preferred embodiment, the nucleic acid molecule encodes a polypeptide that is specific to breast, a breast-specific polypeptide (BSP). In a more preferred embodiment, the nucleic acid molecule encodes a polypeptide that comprises an amino acid sequence of SEQ ID NO: 116 through 218. In another highly preferred embodiment, the nucleic acid molecule comprises a nucleic acid sequence of SEQ ID NO: 1 through 115.

[0113] A BSNA may be derived from a human or from another animal. In a preferred embodiment, the BSNA is derived from a human or other mammal. In a more preferred embodiment, the BSNA is derived from a human or other primate. In an even more preferred embodiment, the BSNA is derived from a human.

[0114] By “nucleic acid molecule” for purposes of the present invention, it is also meant to be inclusive of nucleic acid sequences that selectively hybridize to a nucleic acid molecule encoding a BSNA or a complement thereof. The hybridizing nucleic acid molecule may or may not encode a polypeptide, and may or may not encode a BSP. However, in a preferred embodiment, the hybridizing nucleic acid molecule encodes a BSP. In a more preferred embodiment, the invention provides a nucleic acid molecule that selectively hybridizes to a nucleic acid molecule that encodes a polypeptide comprising an amino acid sequence of SEQ ID NO: 116 through 218. In an even more preferred embodiment, the invention provides a nucleic acid molecule that selectively hybridizes to a nucleic acid molecule comprising the nucleic acid sequence of SEQ ID NO: 1 through 115.

[0115] In a preferred embodiment, the nucleic acid molecule selectively hybridizes to a nucleic acid molecule encoding a BSP under low stringency conditions. In a more preferred embodiment, the nucleic acid molecule selectively hybridizes to a nucleic acid molecule encoding a BSP under moderate stringency conditions. In a more preferred embodiment, the nucleic acid molecule selectively hybridizes to a nucleic acid molecule encoding a BSP under high stringency conditions. In an even more preferred embodiment, the nucleic acid molecule hybridizes under low, moderate or high stringency conditions to a nucleic acid molecule encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 116 through 218. In a yet more preferred embodiment, the nucleic acid molecule hybridizes under low, moderate or high stringency conditions to a nucleic acid molecule comprising a nucleic acid sequence selected from SEQ ID NO: 1 through 115. In a preferred embodiment of the invention, the hybridizing nucleic acid molecule may be used to express recombinantly a polypeptide of the invention.

[0116] By “nucleic acid molecule” as used herein it is also meant to be inclusive of sequences that exhibit substantial sequence similarity to a nucleic acid encoding a BSP or a complement of the encoding nucleic acid molecule. In a preferred embodiment, the nucleic acid molecule exhibits substantial sequence similarity to a nucleic acid molecule encoding human BSP. In a more preferred embodiment, the nucleic acid molecule exhibits substantial sequence similarity to a nucleic acid molecule encoding a polypeptide having an amino acid sequence of SEQ ID NO: 116 through 218. In a preferred embodiment, the similar nucleic acid molecule is one that has at least 60% sequence identity with a nucleic acid molecule encoding a BSP, such as a polypeptide having an amino acid sequence of SEQ ID NO: 116 through 218, more preferably at least 70%, even more preferably at least 80% and even more preferably at least 85%. In a more preferred embodiment, the similar nucleic acid molecule is one that has at least 90% sequence identity with a nucleic acid molecule encoding a BSP, more preferably at least 95%, more preferably at least 97%, even more preferably at least 98%, and still more preferably at least 99%. In another highly preferred embodiment, the nucleic acid molecule is one that has at least 99.5%, 99.6%, 99.7%, 99.8% or 99.9% sequence identity with a nucleic acid molecule encoding a BSP.

[0117] In another preferred embodiment, the nucleic acid molecule exhibits substantial sequence similarity to a BSNA or its complement. In a more preferred embodiment, the nucleic acid molecule exhibits substantial sequence similarity to a nucleic acid molecule comprising a nucleic acid sequence of SEQ ID NO: 1 through 115. In a preferred embodiment, the nucleic acid molecule is one that has at least 60% sequence identity with a BSNA, such as one having a nucleic acid sequence of SEQ ID NO: 1 through 115, more preferably at least 70%, even more preferably at least 80% and even more preferably at least 85%. In a more preferred embodiment, the nucleic acid molecule is one that has at least 90% sequence identity with a BSNA, more preferably at least 95%, more preferably at least 97%, even more preferably at least 98%, and still more preferably at least 99%. In another highly preferred embodiment, the nucleic acid molecule is one that has at least 99.5%, 99.6%, 99.7%, 99.8% or 99.9% sequence identity with a BSNA.

[0118] A nucleic acid molecule that exhibits substantial sequence similarity may be one that exhibits sequence identity over its entire length to a BSNA or to a nucleic acid molecule encoding a BSP, or may be one that is similar over only a part of its length. In this case, the part is at least 50 nucleotides of the BSNA or the nucleic acid molecule encoding a BSP, preferably at least 100 nucleotides, more preferably at least 150 or 200 nucleotides, even more preferably at least 250 or 300 nucleotides, still more preferably at least 400 or 500 nucleotides.
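
The percent identity and minimum length criteria discussed above can be illustrated with a short Python sketch. It assumes that the two sequences have already been aligned by a suitable program so that they are of equal length with gaps written as "-", and it adopts one of several possible conventions by counting only columns in which neither sequence has a gap. The function names and this counting convention are assumptions of the example, not the method of any particular alignment package.

```python
from typing import Tuple


def percent_identity(aligned_a: str, aligned_b: str) -> Tuple[float, int]:
    """Return (percent identity, number of compared columns) for two aligned sequences.

    Columns containing a gap ("-") in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: not counted under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared, compared


def is_substantially_similar(aligned_a: str, aligned_b: str,
                             min_identity: float = 60.0,
                             min_length: int = 50) -> bool:
    """Apply the 60% identity and 50-nucleotide minimums recited above."""
    identity, compared = percent_identity(aligned_a, aligned_b)
    return identity >= min_identity and compared >= min_length
```

The default thresholds correspond to the least stringent values recited above; the higher preferred values (for example 90% identity over at least 100 nucleotides) can be supplied as arguments.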

[0119] The substantially similar nucleic acid molecule may be a naturally-occurring one that is derived from another species, especially one derived from another primate, wherein the similar nucleic acid molecule encodes an amino acid sequence that exhibits significant sequence identity to that of SEQ ID NO: 116 through 218 or demonstrates significant sequence identity to the nucleotide sequence of SEQ ID NO: 1 through 115. The similar nucleic acid molecule may also be a naturally-occurring nucleic acid molecule from a human, when the BSNA is a member of a gene family. The similar nucleic acid molecule may also be a naturally-occurring nucleic acid molecule derived from a non-primate, mammalian species, including without limitation, domesticated species, e.g., dog, cat, mouse, rat, rabbit, hamster, cow, horse and pig; and wild animals, e.g., monkey, fox, lions, tigers, bears, giraffes, zebras, etc. The substantially similar nucleic acid molecule may also be a naturally-occurring nucleic acid molecule derived from a non-mammalian species, such as birds or reptiles. The naturally-occurring substantially similar nucleic acid molecule may be isolated directly from humans or other species. In another embodiment, the substantially similar nucleic acid molecule may be one that is experimentally produced by random mutation of a nucleic acid molecule. In another embodiment, the substantially similar nucleic acid molecule may be one that is experimentally produced by directed mutation of a BSNA. Further, the substantially similar nucleic acid molecule may or may not be a BSNA. However, in a preferred embodiment, the substantially similar nucleic acid molecule is a BSNA.

[0120] By “nucleic acid molecule” it is also meant to be inclusive of allelic variants of a BSNA or a nucleic acid encoding a BSP. For instance, single nucleotide polymorphisms (SNPs) occur frequently in eukaryotic genomes. In fact, more than 1.4 million SNPs have already been identified in the human genome. See International Human Genome Sequencing Consortium, Nature 409: 860-921 (2001). Thus, the sequence determined from one individual of a species may differ from other allelic forms present within the population. Additionally, small deletions and insertions, rather than single nucleotide polymorphisms, are not uncommon in the general population, and often do not alter the function of the protein. Further, amino acid substitutions occur frequently among natural allelic variants, and often do not substantially change protein function.

[0121] In a preferred embodiment, the nucleic acid molecule comprising an allelic variant is a variant of a gene, wherein the gene is transcribed into an mRNA that encodes a BSP. In a more preferred embodiment, the gene is transcribed into an mRNA that encodes a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218. In another preferred embodiment, the allelic variant is a variant of a gene, wherein the gene is transcribed into an mRNA that is a BSNA. In a more preferred embodiment, the gene is transcribed into an mRNA that comprises the nucleic acid sequence of SEQ ID NO: 1 through 115. In a preferred embodiment, the allelic variant is a naturally-occurring allelic variant in the species of interest. In a more preferred embodiment, the species of interest is human.

[0122] By “nucleic acid molecule” it is also meant to be inclusive of a part of a nucleic acid sequence of the instant invention. The part may or may not encode a polypeptide, and may or may not encode a polypeptide that is a BSP. However, in a preferred embodiment, the part encodes a BSP. In one aspect, the invention comprises a part of a BSNA. In a second aspect, the invention comprises a part of a nucleic acid molecule that hybridizes or exhibits substantial sequence similarity to a BSNA. In a third aspect, the invention comprises a part of a nucleic acid molecule that is an allelic variant of a BSNA. In a fourth aspect, the invention comprises a part of a nucleic acid molecule that encodes a BSP. A part comprises at least 10 nucleotides, more preferably at least 15, 17, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400 or 500 nucleotides. The maximum size of a nucleic acid part is one nucleotide shorter than the sequence of the nucleic acid molecule encoding the full-length protein.

[0123] By “nucleic acid molecule” it is also meant to be inclusive of sequences encoding a fusion protein, a homologous protein, a polypeptide fragment, a mutein or a polypeptide analog, as described below.

[0124] Nucleotide sequences of the instantly-described nucleic acids were determined by sequencing a DNA molecule that had resulted, directly or indirectly, from at least one enzymatic polymerization reaction (e.g., reverse transcription and/or polymerase chain reaction) using an automated sequencer (such as the MegaBACE™ 1000, Molecular Dynamics, Sunnyvale, Calif., USA). Further, all amino acid sequences of the polypeptides of the present invention were predicted by translation from the nucleic acid sequences so determined, unless otherwise specified.

[0125] In a preferred embodiment of the invention, the nucleic acid molecule contains modifications of the native nucleic acid molecule. These modifications include nonnative internucleoside bonds, post-synthetic modifications or altered nucleotide analogues. One having ordinary skill in the art would recognize that the type of modification that can be made will depend upon the intended use of the nucleic acid molecule. For instance, when the nucleic acid molecule is used as a hybridization probe, the range of such modifications will be limited to those that permit sequence-discriminating base pairing of the resulting nucleic acid. When used to direct expression of RNA or protein in vitro or in vivo, the range of such modifications will be limited to those that permit the nucleic acid to function properly as a polymerization substrate. When the isolated nucleic acid is used as a therapeutic agent, the modifications will be limited to those that do not confer toxicity upon the isolated nucleic acid.

[0126] In a preferred embodiment, isolated nucleic acid molecules can include nucleotide analogues that incorporate labels that are directly detectable, such as radiolabels or fluorophores, or nucleotide analogues that incorporate labels that can be visualized in a subsequent reaction, such as biotin or various haptens. In a more preferred embodiment, the labeled nucleic acid molecule may be used as a hybridization probe.

[0127] Common radiolabeled analogues include those labeled with ³³P,³²P, and ³⁵S, such as -³²P-dATP, -³²P-dCTP, -³²P-dGTP, -³²P-dTTP,-³²P-3′dATP, -³²P-ATP, -³²P-CTP, -³²P-GTP, -³²P-UTP, -³⁵S-dATP,α-³⁵S-GTP, α-³³P-dATP, and the like.

[0128] Commercially available fluorescent nucleotide analogues readilyincorporated into the nucleic acids of the present invention includeCy3-dCTP, Cy3-dUTP, Cy5-dCTP, Cy3-dUTP (Amersham Pharmacia Biotech,Piscataway, N.J., USA), fluorescein-12-dUTP,tetramethylrhodamine-6-dUTP, Texas Red®-5-dUTP, Cascade Blue®-7-dUTP,BODIPY® FL-14-dUTP, BODIPY® TMR-14-dUTP, BODIPY® TR-14-dUTP, RhodamineGreen™-5-dUTP, Oregon Green® 488-5-dUTP, Texas Red®-12-dUTP, BODIPY®630/650-14-dUTP, BODIPY® 650/665-14-dUTP, Alexa Fluor® 488-5-dUTP, AlexaFluor® 532-5-dUTP, Alexa Fluor® 568-5-dUTP, Alexa Fluor® 594-5-dUTP,Alexa Fluor® 546-14-dUTP, fluorescein-12-UTP,tetramethylrhodamine-6-UTP, Texas Red®-5-UTP, Cascade Blue®-7-UTP,BODIPY® FL-14-UTP, BODIPY® TMR-14-UTP, BODIPY® TR-14-UTP, RhodamineGreen™-5-UTP, Alexa Fluor® 488-5-UTP, Alexa Fluor® 546-14-UTP (MolecularProbes, Inc. Eugene, Oreg., USA). One may also custom synthesizenucleotides having other fluorophores. See Henegariu et al., NatureBiotechnol. 18: 345-348 (2000), the disclosure of which is incorporatedherein by reference in its entirety.

[0129] Haptens that are commonly conjugated to nucleotides for subsequent labeling include biotin (biotin-11-dUTP, Molecular Probes, Inc., Eugene, Oreg., USA; biotin-21-UTP, biotin-21-dUTP, Clontech Laboratories, Inc., Palo Alto, Calif., USA), digoxigenin (DIG-11-dUTP, alkali labile, DIG-11-UTP, Roche Diagnostics Corp., Indianapolis, Ind., USA), and dinitrophenyl (dinitrophenyl-11-dUTP, Molecular Probes, Inc., Eugene, Oreg., USA).

[0130] Nucleic acid molecules can be labeled by incorporation of labeled nucleotide analogues into the nucleic acid. Such analogues can be incorporated by enzymatic polymerization, such as by nick translation, random priming, polymerase chain reaction (PCR), terminal transferase tailing, and end-filling of overhangs, for DNA molecules, and in vitro transcription driven, e.g., from phage promoters, such as T7, T3, and SP6, for RNA molecules. Commercial kits are readily available for each such labeling approach. Analogues can also be incorporated during automated solid phase chemical synthesis. Labels can also be incorporated after nucleic acid synthesis, with the 5′ phosphate and 3′ hydroxyl providing convenient sites for post-synthetic covalent attachment of detectable labels.

[0131] Other post-synthetic approaches also permit internal labeling ofnucleic acids. For example, fluorophores can be attached using acisplatin reagent that reacts with the N7 of guanine residues (and, to alesser extent, adenine bases) in DNA, RNA, and PNA to provide a stablecoordination complex between the nucleic acid and fluorophore label(Universal Linkage System) (available from Molecular Probes, Inc.,Eugene, Oreg., USA and Amersham Pharmacia Biotech, Piscataway, N.J.,USA); see Alers et al., Genes, Chromosomes & Cancer 25: 301-305 (1999);Jelsma et al., J. NIH Res. 5: 82 (1994); Van Belkum et al.,BioTechniques 16: 148-153 (1994), incorporated herein by reference. Asanother example, nucleic acids can be labeled using adisulfide-containing linker (FastTag™ Reagent, Vector Laboratories,Inc., Burlingame, Calif., USA) that is photo- or thermally-coupled tothe target nucleic acid using aryl azide chemistry; after reduction, afree thiol is available for coupling to a hapten, fluorophore, sugar,affinity ligand, or other marker.

[0132] One or more independent or interacting labels can be incorporatedinto the nucleic acid molecules of the present invention. For example,both a fluorophore and a moiety that in proximity thereto acts to quenchfluorescence can be included to report specific hybridization throughrelease of fluorescence quenching or to report exonucleotidic excision.See, e.g., Tyagi et al., Nature Biotechnol. 14: 303-308 (1996); Tyagi etal., Nature Biotechnol. 16: 49-53 (1998); Sokol et al., Proc. Natl.Acad. Sci. USA 95: 11538-11543 (1998); Kostrikis et al., Science 279:1228-1229 (1998); Marras et al., Genet. Anal. 14: 151-156 (1999); U. S.Pat. Nos. 5,846,726; 5,925,517; 5,925,517; 5,723,591 and 5,538,848;Holland et al., Proc. Natl. Acad. Sci. USA 88: 7276-7280 (1991); Heid etal., Genome Res. 6(10): 986-94 (1996); Kuimelis et al., Nucleic AcidsSymp. Ser. (37): 255-6 (1997); the disclosures of which are incorporatedherein by reference in their entireties.

[0133] Nucleic acid molecules of the invention may be modified byaltering one or more native phosphodiester internucleoside bonds to morenuclease-resistant, internucleoside bonds. See Hartmann et al. (eds.),Manual of Antisense Methodology: Perspectives in Antisense Science,Kluwer Law International (1999); Stein et al. (eds.), Applied AntisenseOligonucleotide Technology, Wiley-Liss (1998); Chadwick et al. (eds.),Oligonucleotides as Therapeutic Agents—Symposium No. 209, John Wiley &Son Ltd (1997); the disclosures of which are incorporated herein byreference in their entireties. Such altered internucleoside bonds areoften desired for antisense techniques or for targeted gene correction.See Gamper et al., Nucl. Acids Res. 28(21): 4332-4339 (2000), thedisclosure of which is incorporated herein by reference in its entirety.

[0134] Modified oligonucleotide backbones include, without limitation,phosphorothioates, chiral phosphorothioates, phosphorodithioates,phosphotriesters, aminoalkylphosphotriesters, methyl and other alkylphosphonates including 3′-alkylene phosphonates and chiral phosphonates,phosphinates, phosphoramidates including 3′-amino phosphoramidate andaminoalkylphosphoramidates, thionophosphoramidates,thionoalkylphosphonates, thionoalkylphosphotriesters, andboranophosphates having normal 3′-5′ linkages, 2′-5′ linked analogs ofthese, and those having inverted polarity wherein the adjacent pairs ofnucleoside units are linked 3′-5′ to 5′-3′ or 2′-5′ to 5′-2′.Representative United States patents that teach the preparation of theabove phosphorus-containing linkages include, but are not limited to,U.S. Pat. Nos. 3,687,808; 4,469,863; 4,476,301; 5,023,243; 5,177,196;5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131;5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677; 5,476,925;5,519,126; 5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799;5,587,361; and 5,625,050, the disclosures of which are incorporatedherein by reference in their entireties. In a preferred embodiment, themodified internucleoside linkages may be used for antisense techniques.

[0135] Other modified oligonucleotide backbones do not include aphosphorus atom, but have backbones that are formed by short chain alkylor cycloalkyl internucleoside linkages, mixed heteroatom and alkyl orcycloalkyl internucleoside linkages, or one or more short chainheteroatomic or heterocyclic internucleoside linkages. These includethose having morpholino linkages (formed in part from the sugar portionof a nucleoside); siloxane backbones; sulfide, sulfoxide and sulfonebackbones; formacetyl and thioformacetyl backbones; methylene formacetyland thioformacetyl backbones; alkene containing backbones; sulfamatebackbones; methyleneimino and methylenehydrazino backbones; sulfonateand sulfonamide backbones; amide backbones; and others having mixed N,O, S and CH₂ component parts. Representative U.S. patents that teach thepreparation of the above backbones include, but are not limited to, U.S.Pat. Nos. 5,034,506; 5,166,315; 5,185,444; 5,214,134; 5,216,141;5,235,033; 5,264,562; 5,264,564; 5,405,938; 5,434,257; 5,466,677;5,470,967; 5,489,677; 5,541,307; 5,561,225; 5,596,086; 5,602,240;5,610,289; 5,602,240; 5,608,046; 5,610,289; 5,618,704; 5,623,070;5,663,312; 5,633,360; 5,677,437 and 5,677,439; the disclosures of whichare incorporated herein by reference in their entireties.

[0136] In other preferred oligonucleotide mimetics, both the sugar and the internucleoside linkage are replaced with novel groups, such as peptide nucleic acids (PNA). In PNA compounds, the phosphodiester backbone of the nucleic acid is replaced with an amide-containing backbone, in particular by repeating N-(2-aminoethyl)glycine units linked by amide bonds. Nucleobases are bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone, typically by methylene carbonyl linkages. PNA can be synthesized using a modified peptide synthesis protocol. PNA oligomers can be synthesized by both Fmoc and tBoc methods. Representative U.S. patents that teach the preparation of PNA compounds include, but are not limited to, U.S. Pat. Nos. 5,539,082; 5,714,331; and 5,719,262, each of which is herein incorporated by reference. Automated PNA synthesis is readily achievable on commercial synthesizers (see, e.g., “PNA User's Guide,” Rev. 2, February 1998, Perseptive Biosystems Part No. 60138, Applied Biosystems, Inc., Foster City, Calif.).

[0137] PNA molecules are advantageous for a number of reasons. First, because the PNA backbone is uncharged, PNA/DNA and PNA/RNA duplexes have a higher thermal stability than is found in DNA/DNA and DNA/RNA duplexes. The Tm of a PNA/DNA or PNA/RNA duplex is generally 1° C. higher per base pair than the Tm of the corresponding DNA/DNA or DNA/RNA duplex (in 100 mM NaCl). Second, PNA molecules can also form stable PNA/DNA complexes at low ionic strength, under conditions in which DNA/DNA duplex formation does not occur. Third, PNA also demonstrates greater specificity in binding to complementary DNA because a PNA/DNA mismatch is more destabilizing than a DNA/DNA mismatch. A single mismatch in a mixed PNA/DNA 15-mer lowers the Tm by 8-20° C. (15° C. on average). In the corresponding DNA/DNA duplexes, a single mismatch lowers the Tm by 4-16° C. (11° C. on average). Because PNA probes can be significantly shorter than DNA probes, their specificity is greater. Fourth, PNA oligomers are resistant to degradation by enzymes, and the lifetime of these compounds is extended both in vivo and in vitro because nucleases and proteases do not recognize the PNA polyamide backbone with nucleobase sidechains. See, e.g., Ray et al., FASEB J. 14(9): 1041-60 (2000); Nielsen et al., Pharmacol Toxicol. 86(1): 3-7 (2000); Larsen et al., Biochim Biophys Acta. 1489(1): 159-66 (1999); Nielsen, Curr. Opin. Struct. Biol. 9(3): 353-7 (1999), and Nielsen, Curr. Opin. Biotechnol. 10(1): 71-5 (1999), the disclosures of which are incorporated herein by reference in their entireties.
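
The thermal stability figures given above lend themselves to a simple back-of-the-envelope calculation, sketched below in Python. The sketch applies the "generally 1° C. higher per base pair" rule and the single-mismatch ranges reported for 15-mers exactly as stated; it is a rough illustration under those stated figures, not a thermodynamic model, and all names in it are hypothetical.

```python
from typing import Tuple

# Figures taken from the paragraph above; everything else is illustrative.
PNA_TM_GAIN_PER_BP_C = 1.0        # approximate Tm gain per base pair (100 mM NaCl)
MISMATCH_DROP_C = {
    "PNA/DNA": (8.0, 20.0),       # Tm drop range for one mismatch in a 15-mer
    "DNA/DNA": (4.0, 16.0),
}


def estimate_pna_duplex_tm(dna_duplex_tm_c: float, length_bp: int) -> float:
    """Estimate the Tm of a PNA/DNA duplex from the Tm of the corresponding DNA/DNA duplex."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return dna_duplex_tm_c + PNA_TM_GAIN_PER_BP_C * length_bp


def mismatch_tm_range(matched_tm_c: float, duplex: str) -> Tuple[float, float]:
    """Return (lowest, highest) expected Tm after a single mismatch."""
    smallest_drop, largest_drop = MISMATCH_DROP_C[duplex]
    return matched_tm_c - largest_drop, matched_tm_c - smallest_drop
```

For a 15-mer whose DNA/DNA duplex melts at a hypothetical 50° C., this sketch estimates a PNA/DNA Tm of about 65° C., falling to between 45° C. and 57° C. with a single mismatch.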

[0138] Nucleic acid molecules may be modified, compared to their native structure, throughout the length of the nucleic acid molecule, or the modifications can be localized to discrete portions thereof. As an example of the latter, chimeric nucleic acids can be synthesized that have discrete DNA and RNA domains and that can be used for targeted gene repair and modified PCR reactions, as further described in U.S. Pat. Nos. 5,760,012 and 5,731,181; Misra et al., Biochem. 37: 1917-1925 (1998); and Finn et al., Nucl. Acids Res. 24: 3357-3363 (1996), the disclosures of which are incorporated herein by reference in their entireties.

[0139] Unless otherwise specified, nucleic acids of the present invention can include any topological conformation appropriate to the desired use; the term thus explicitly comprehends, among others, single-stranded, double-stranded, triplexed, quadruplexed, partially double-stranded, partially-triplexed, partially-quadruplexed, branched, hairpinned, circular, and padlocked conformations. Padlock conformations and their utilities are further described in Banér et al., Curr. Opin. Biotechnol. 12: 11-15 (2001); Escude et al., Proc. Natl. Acad. Sci. USA 96(19): 10603-7 (1999); and Nilsson et al., Science 265(5181): 2085-8 (1994), the disclosures of which are incorporated herein by reference in their entireties. Triplex and quadruplex conformations, and their utilities, are reviewed in Praseuth et al., Biochim. Biophys. Acta. 1489(1): 181-206 (1999); Fox, Curr. Med. Chem. 7(1): 17-37 (2000); Kochetkova et al., Methods Mol. Biol. 130: 189-201 (2000); and Chan et al., J. Mol. Med. 75(4): 267-82 (1997), the disclosures of which are incorporated herein by reference in their entireties.

[0140] Methods for Using Nucleic Acid Molecules as Probes and Primers

[0141] The isolated nucleic acid molecules of the present invention can be used as hybridization probes to detect, characterize, and quantify hybridizing nucleic acids in, and isolate hybridizing nucleic acids from, both genomic and transcript-derived nucleic acid samples. When free in solution, such probes are typically, but not invariably, detectably labeled; bound to a substrate, as in a microarray, such probes are typically, but not invariably, unlabeled.

[0142] In one embodiment, the isolated nucleic acids of the present invention can be used as probes to detect and characterize gross alterations in the gene of a BSNA, such as deletions, insertions, translocations, and duplications of the BSNA genomic locus through fluorescence in situ hybridization (FISH) to chromosome spreads. See, e.g., Andreeff et al. (eds.), Introduction to Fluorescence In Situ Hybridization: Principles and Clinical Applications, John Wiley & Sons (1999), the disclosure of which is incorporated herein by reference in its entirety. The isolated nucleic acids of the present invention can be used as probes to assess smaller genomic alterations using, e.g., Southern blot detection of restriction fragment length polymorphisms. The isolated nucleic acid molecules of the present invention can be used as probes to isolate genomic clones that include the nucleic acid molecules of the present invention, which thereafter can be restriction mapped and sequenced to identify deletions, insertions, translocations, and substitutions (single nucleotide polymorphisms, SNPs) at the sequence level.
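As a non-limiting illustration of why a restriction fragment length polymorphism is detectable, the following Python sketch performs an in silico digest of a linear sequence and reports fragment lengths. It assumes the enzyme EcoRI (recognition site GAATTC, cutting after the first G); the two toy alleles and the function name are hypothetical and do not correspond to any sequence of the invention.

```python
# In silico digest sketch; the alleles below are made-up toy sequences.

def fragment_lengths(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position within the site after which the strand is cut
    (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


allele_a = "AAAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"   # two EcoRI sites
allele_b = "AAAA" + "GAATTC" + "CCCC" + "GAGTTC" + "TT"   # second site lost to a substitution

print(fragment_lengths(allele_a))   # three fragments
print(fragment_lengths(allele_b))   # two fragments, one of them longer
```

A single substitution that destroys a recognition site merges two fragments into one longer fragment, which is the band shift that a labeled probe reveals on a Southern blot.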

[0143] In another embodiment, the isolated nucleic acid molecules of the present invention can be used as probes to detect, characterize, and quantify BSNA in, and isolate BSNA from, transcript-derived nucleic acid samples. In one aspect, the isolated nucleic acid molecules of the present invention can be used as hybridization probes to detect, characterize by length, and quantify mRNA by Northern blot of total or poly-A⁺-selected RNA samples. In another aspect, the isolated nucleic acid molecules of the present invention can be used as hybridization probes to detect, characterize by location, and quantify mRNA by in situ hybridization to tissue sections. See, e.g., Schwarzacher et al., In Situ Hybridization, Springer-Verlag New York (2000), the disclosure of which is incorporated herein by reference in its entirety. In another preferred embodiment, the isolated nucleic acid molecules of the present invention can be used as hybridization probes to measure the representation of clones in a cDNA library or to isolate hybridizing nucleic acid molecules from cDNA libraries, permitting sequence level characterization of mRNAs that hybridize to BSNAs, including, without limitation, identification of deletions, insertions, substitutions, truncations, alternatively spliced forms and single nucleotide polymorphisms. In yet another preferred embodiment, the nucleic acid molecules of the instant invention may be used in microarrays.

[0144] All of the aforementioned probe techniques are well within the skill in the art, and are described at greater length in standard texts such as Sambrook (2001), supra; Ausubel (1999), supra; and Walker et al. (eds.), The Nucleic Acids Protocols Handbook, Humana Press (2000), the disclosures of which are incorporated herein by reference in their entireties.

[0145] Thus, in one embodiment, a nucleic acid molecule of the invention may be used as a probe or primer to identify or amplify a second nucleic acid molecule that selectively hybridizes to the nucleic acid molecule of the invention. In a preferred embodiment, the probe or primer is derived from a nucleic acid molecule encoding a BSP. In a more preferred embodiment, the probe or primer is derived from a nucleic acid molecule encoding a polypeptide having an amino acid sequence of SEQ ID NO: 116 through 218. In another preferred embodiment, the probe or primer is derived from a BSNA. In a more preferred embodiment, the probe or primer is derived from a nucleic acid molecule having a nucleotide sequence of SEQ ID NO: 1 through 115.

[0146] In general, a probe or primer is at least 10 nucleotides in length, more preferably at least 12, more preferably at least 14 and even more preferably at least 16 or 17 nucleotides in length. In an even more preferred embodiment, the probe or primer is at least 18 nucleotides in length, even more preferably at least 20 nucleotides and even more preferably at least 22 nucleotides in length. Primers and probes may also be longer. For instance, a probe or primer may be 25 nucleotides in length, or may be 30, 40 or 50 nucleotides in length. Methods of performing nucleic acid hybridization using oligonucleotide probes are well-known in the art. See, e.g., Sambrook et al., 1989, supra, Chapter 11 and pp. 11.31-11.32 and 11.40-11.44, which describe radiolabeling of short probes, and pp. 11.45-11.53, which describe hybridization conditions for oligonucleotide probes, including specific conditions for probe hybridization (pp. 11.50-11.51).
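Purely as an illustration of the length criteria above, the following Python sketch enumerates every candidate probe or primer of a chosen length from a template and provides the reverse complement needed for an antisense primer. The function names, the default length of 20 nucleotides and the toy template are assumptions made for the example; the sketch does not evaluate Tm, secondary structure or specificity.

```python
# Illustrative sketch; names, defaults and the template are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(seq: str, length: int = 20, min_length: int = 10) -> list:
    """List every window of the given length, enforcing the 10-nucleotide floor."""
    if length < min_length:
        raise ValueError(f"probe length must be at least {min_length} nucleotides")
    seq = seq.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


template = "ACGTACGTACGTACGTACGTAC"        # 22-nt toy template
for probe in candidate_probes(template, length=20):
    print(probe, reverse_complement(probe))
```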

[0147] Methods of performing primer-directed amplification are also well-known in the art. Methods for performing the polymerase chain reaction (PCR) are compiled, inter alia, in McPherson, PCR Basics: From Background to Bench, Springer Verlag (2000); Innis et al. (eds.), PCR Applications: Protocols for Functional Genomics, Academic Press (1999); Gelfand et al. (eds.), PCR Strategies, Academic Press (1998); Newton et al., PCR, Springer-Verlag New York (1997); Burke (ed.), PCR: Essential Techniques, John Wiley & Son Ltd (1996); White (ed.), PCR Cloning Protocols: From Molecular Cloning to Genetic Engineering, Vol. 67, Humana Press (1996); and McPherson et al. (eds.), PCR 2: A Practical Approach, Oxford University Press, Inc. (1995); the disclosures of which are incorporated herein by reference in their entireties. Methods for performing RT-PCR are collected, e.g., in Siebert et al. (eds.), Gene Cloning and Analysis by RT-PCR, Eaton Publishing Company/BioTechniques Books Division (1998); and Siebert (ed.), PCR Technique: RT-PCR, Eaton Publishing Company/BioTechniques Books (1995); the disclosures of which are incorporated herein by reference in their entireties.

[0148] PCR and hybridization methods may be used to identify and/or isolate allelic variants, homologous nucleic acid molecules and fragments of the nucleic acid molecules of the invention. PCR and hybridization methods may also be used to identify, amplify and/or isolate nucleic acid molecules that encode homologous proteins, analogs, fusion proteins or muteins of the invention. The nucleic acid primers of the present invention can be used to prime amplification of nucleic acid molecules of the invention, using transcript-derived or genomic DNA as template.

[0149] The nucleic acid primers of the present invention can also be used, for example, to prime single base extension (SBE) for SNP detection (see, e.g., U.S. Pat. No. 6,004,744, the disclosure of which is incorporated herein by reference in its entirety).

[0150] Isothermal amplification approaches, such as rolling circle amplification, are also now well-described. See, e.g., Schweitzer et al., Curr. Opin. Biotechnol. 12(1): 21-7 (2001); U.S. Pat. Nos. 5,854,033 and 5,714,320; and international patent publications WO 97/19193 and WO 00/15779, the disclosures of which are incorporated herein by reference in their entireties. Rolling circle amplification can be combined with other techniques to facilitate SNP detection. See, e.g., Lizardi et al., Nature Genet. 19(3): 225-32 (1998).

[0151] Nucleic acid molecules of the present invention may be bound to a substrate either covalently or noncovalently. The substrate can be porous or solid, planar or non-planar, unitary or distributed. The bound nucleic acid molecules may be used as hybridization probes, and may be labeled or unlabeled. In a preferred embodiment, the bound nucleic acid molecules are unlabeled.

[0152] In one embodiment, the nucleic acid molecule of the present invention is bound to a porous substrate, e.g., a membrane, typically comprising nitrocellulose, nylon, or positively-charged derivatized nylon. The nucleic acid molecule of the present invention can be used to detect a hybridizing nucleic acid molecule that is present within a labeled nucleic acid sample, e.g., a sample of transcript-derived nucleic acids. In another embodiment, the nucleic acid molecule is bound to a solid substrate, including, without limitation, glass, amorphous silicon, crystalline silicon or plastics. Examples of plastics include, without limitation, polymethylacrylic, polyethylene, polypropylene, polyacrylate, polymethylmethacrylate, polyvinylchloride, polytetrafluoroethylene, polystyrene, polycarbonate, polyacetal, polysulfone, cellulose acetate, cellulose nitrate, nitrocellulose, or mixtures thereof. The solid substrate may be any shape, including rectangular, disk-like and spherical. In a preferred embodiment, the solid substrate is a microscope slide or slide-shaped substrate.

[0153] The nucleic acid molecule of the present invention can be attached covalently to a surface of the support substrate or applied to a derivatized surface in a chaotropic agent that facilitates denaturation and adherence by presumed noncovalent interactions, or some combination thereof. The nucleic acid molecule of the present invention can be bound to a substrate to which a plurality of other nucleic acids are concurrently bound, hybridization to each of the plurality of bound nucleic acids being separately detectable. At low density, e.g., on a porous membrane, these substrate-bound collections are typically denominated macroarrays; at higher density, typically on a solid support, such as glass, these substrate-bound collections of plural nucleic acids are colloquially termed microarrays. As used herein, the term microarray includes arrays of all densities. It is, therefore, another aspect of the invention to provide microarrays that include the nucleic acids of the present invention.

[0154] Expression Vectors, Host Cells and Recombinant Methods of Producing Polypeptides

[0155] Another aspect of the present invention relates to vectors that comprise one or more of the isolated nucleic acid molecules of the present invention, and host cells in which such vectors have been introduced.

[0156] The vectors can be used, inter alia, for propagating the nucleic acids of the present invention in host cells (cloning vectors), for shuttling the nucleic acids of the present invention between host cells derived from disparate organisms (shuttle vectors), for inserting the nucleic acids of the present invention into host cell chromosomes (insertion vectors), for expressing sense or antisense RNA transcripts of the nucleic acids of the present invention in vitro or within a host cell, and for expressing polypeptides encoded by the nucleic acids of the present invention, alone or as fusions to heterologous polypeptides (expression vectors). Vectors of the present invention will often be suitable for several such uses.

[0157] Vectors are by now well-known in the art, and are described, inter alia, in Jones et al. (eds.), Vectors: Cloning Applications: Essential Techniques (Essential Techniques Series), John Wiley & Son Ltd. (1998); Jones et al. (eds.), Vectors: Expression Systems: Essential Techniques (Essential Techniques Series), John Wiley & Son Ltd. (1998); Gacesa et al., Vectors: Essential Data, John Wiley & Sons Ltd. (1995); Cid-Arregui (ed.), Viral Vectors: Basic Science and Gene Therapy, Eaton Publishing Co. (2000); Sambrook (2001), supra; and Ausubel (1999), supra; the disclosures of which are incorporated herein by reference in their entireties. Furthermore, an enormous variety of vectors are available commercially. Use of existing vectors and modifications thereof being well within the skill in the art, only basic features need be described here.

[0158] Nucleic acid sequences may be expressed by operatively linking them to an expression control sequence in an appropriate expression vector and employing that expression vector to transform an appropriate unicellular host. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Such operative linking of a nucleic acid sequence of this invention to an expression control sequence, of course, includes, if not already part of the nucleic acid sequence, the provision of a translation initiation codon, ATG or GTG, in the correct reading frame upstream of the nucleic acid sequence.
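The provision of an in-frame initiation codon can be pictured with a short Python sketch. It assumes that the input is a coding sequence already in reading frame 1 with a length that is a multiple of three; the function name is hypothetical, and the choice to prepend ATG rather than GTG when no initiation codon is present is an arbitrary assumption of the example.

```python
# Minimal sketch; the function name and the ATG default are illustrative assumptions.

START_CODONS = ("ATG", "GTG")


def with_initiation_codon(coding_seq: str) -> str:
    """Return the coding sequence, prepending ATG if it lacks an initiation codon.

    Prepending a whole codon keeps the downstream reading frame unchanged.
    """
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if seq[:3] in START_CODONS:
        return seq                      # already begins with ATG or GTG
    return "ATG" + seq


print(with_initiation_codon("GCTGAA"))  # initiation codon added in frame
print(with_initiation_codon("GTGGCT"))  # left unchanged
```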

[0159] A wide variety of host/expression vector combinations may be employed in expressing the nucleic acid sequences of this invention. Useful expression vectors, for example, may consist of segments of chromosomal, non-chromosomal and synthetic nucleic acid sequences.

[0160] In one embodiment, prokaryotic cells may be used with an appropriate vector. Prokaryotic host cells are often used for cloning and expression. In a preferred embodiment, prokaryotic host cells include E. coli, Pseudomonas, Bacillus and Streptomyces. In a preferred embodiment, bacterial host cells are used to express the nucleic acid molecules of the instant invention. Useful expression vectors for bacterial hosts include bacterial plasmids, such as those from E. coli, Bacillus or Streptomyces, including pBluescript, pGEX-2T, pUC vectors, col E1, pCR1, pBR322, pMB9 and their derivatives; wider host range plasmids, such as RP4; phage DNAs, e.g., the numerous derivatives of phage lambda, e.g., NM989, λGT10 and λGT11; and other phages, e.g., M13 and filamentous single-stranded phage DNA. Where E. coli is used as host, selectable markers are, analogously, chosen for selectivity in gram negative bacteria: e.g., typical markers confer resistance to antibiotics, such as ampicillin, tetracycline, chloramphenicol, kanamycin, streptomycin and zeocin; auxotrophic markers can also be used.

[0161] In other embodiments, eukaryotic host cells, such as yeast, insect, mammalian or plant cells, may be used. Yeast cells, typically S. cerevisiae, are useful for eukaryotic genetic studies, due to the ease of targeting genetic changes by homologous recombination and the ability to easily complement genetic defects using recombinantly expressed proteins. Yeast cells are useful for identifying interacting protein components, e.g., through use of a two-hybrid system. In a preferred embodiment, yeast cells are useful for protein expression. Vectors of the present invention for use in yeast will typically, but not invariably, contain an origin of replication suitable for use in yeast and a selectable marker that is functional in yeast. Yeast vectors include Yeast Integrating plasmids (e.g., YIp5) and Yeast Replicating plasmids (the YRp and YEp series plasmids), Yeast Centromere plasmids (the YCp series plasmids), Yeast Artificial Chromosomes (YACs) which are based on yeast linear plasmids, denoted YLp, pGPD-2, 2μ plasmids and derivatives thereof, and improved shuttle vectors such as those described in Gietz et al., Gene, 74: 527-34 (1988) (YIplac, YEplac and YCplac). Selectable markers in yeast vectors include a variety of auxotrophic markers, the most common of which are (in Saccharomyces cerevisiae) URA3, HIS3, LEU2, TRP1 and LYS2, which complement specific auxotrophic mutations, such as ura3-52, his3-Δ1, leu2-Δ1, trp1-Δ1 and lys2-201.

[0162] Insect cells are often chosen for high efficiency protein expression. Where the host cells are from Spodoptera frugiperda, e.g., Sf9 and Sf21 cell lines and expresSF™ cells (Protein Sciences Corp., Meriden, Conn., USA), the vector replicative strategy is typically based upon the baculovirus life cycle. Typically, baculovirus transfer vectors are used to replace the wild-type AcMNPV polyhedrin gene with a heterologous gene of interest. Sequences that flank the polyhedrin gene in the wild-type genome are positioned 5′ and 3′ of the expression cassette on the transfer vectors. Following co-transfection with AcMNPV DNA, a homologous recombination event occurs between these sequences, resulting in a recombinant virus carrying the gene of interest and the polyhedrin or p10 promoter. Selection can be based upon visual screening for lacZ fusion activity.

[0163] In another embodiment, the host cells may be mammalian cells, which are particularly useful for expression of proteins intended as pharmaceutical agents, and for screening of potential agonists and antagonists of a protein or a physiological pathway. Mammalian vectors intended for autonomous extrachromosomal replication will typically include a viral origin, such as the SV40 origin (for replication in cell lines expressing the large T-antigen, such as COS1 and COS7 cells), the papillomavirus origin, or the EBV origin for long term episomal replication (for use, e.g., in 293-EBNA cells, which constitutively express the EBV EBNA-1 gene product and adenovirus E1A). Vectors intended for integration, and thus replication as part of the mammalian chromosome, can, but need not, include an origin of replication functional in mammalian cells, such as the SV40 origin. Vectors based upon viruses, such as adenovirus, adeno-associated virus, vaccinia virus, and various mammalian retroviruses, will typically replicate according to the viral replicative strategy. Selectable markers for use in mammalian cells include resistance to neomycin (G418), blasticidin, hygromycin and zeocin, and selection based upon the purine salvage pathway using HAT medium.

[0164] Expression in mammalian cells can be achieved using a variety of plasmids, including pSV2, pBC12BI, and p91023, as well as lytic virus vectors (e.g., vaccinia virus, adenovirus, and baculovirus), episomal virus vectors (e.g., bovine papillomavirus), and retroviral vectors (e.g., murine retroviruses). Useful vectors for insect cells include baculoviral vectors and pVL 941.

[0165] Plant cells can also be used for expression, with the vector replicon typically derived from a plant virus (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) and selectable markers chosen for suitability in plants.

[0166] It is known that codon usage may differ between host cells. For example, a plant cell and a human cell may exhibit a difference in codon preference for encoding a particular amino acid. As a result, human mRNA may not be efficiently translated in a plant, bacterial or insect host cell. Therefore, another embodiment of this invention is directed to codon optimization. The codons of the nucleic acid molecules of the invention may be modified to resemble, as much as possible, genes naturally contained within the host cell without altering the amino acid sequence encoded by the nucleic acid molecule.
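The principle of codon optimization, replacing codons with synonymous codons favored by the host while leaving the encoded amino acid sequence unchanged, can be sketched in Python as follows. The standard genetic code is used; the table of preferred codons is a hypothetical, partial stand-in for a real host codon-usage table, and the function names and input sequence are likewise illustrative assumptions.

```python
# Sketch only: PREFERRED is a made-up partial table, not measured host codon usage.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in the conventional TCAG codon order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PREFERRED = {"L": "CTG", "K": "AAA", "E": "GAA"}   # hypothetical host preferences


def translate(coding_seq: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    seq = coding_seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def optimize_codons(coding_seq: str, preferred: dict) -> str:
    """Swap each codon for the host-preferred synonymous codon, if one is listed."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        out.append(new_codon)
    return "".join(out)


original = "ATGTTAAAGGAGTAA"
optimized = optimize_codons(original, PREFERRED)
print(optimized, translate(original) == translate(optimized))
```

Checking that the translation is identical before and after is the safeguard corresponding to the requirement that the encoded amino acid sequence not be altered.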

[0167] Any of a wide variety of expression control sequences may be used in these vectors to express the DNA sequences of this invention. Such useful expression control sequences include the expression control sequences associated with structural genes of the foregoing expression vectors. Expression control sequences that control transcription include, e.g., promoters, enhancers and transcription termination sites. Expression control sequences in eukaryotic cells that control post-transcriptional events include splice donor and acceptor sites and sequences that modify the half-life of the transcribed RNA, e.g., sequences that direct poly(A) addition or binding sites for RNA-binding proteins. Expression control sequences that control translation include ribosome binding sites, sequences which direct targeted expression of the polypeptide to or within particular cellular compartments, and sequences in the 5′ and 3′ untranslated regions that modify the rate or efficiency of translation.

[0168] Examples of useful expression control sequences for a prokaryote, e.g., E. coli, will include a promoter, often a phage promoter, such as phage lambda pL promoter, the trc promoter, a hybrid derived from the trp and lac promoters, the bacteriophage T7 promoter (in E. coli cells engineered to express the T7 polymerase), the TAC or TRC system, the major operator and promoter regions of phage lambda, the control regions of fd coat protein, or the araBAD operon. Prokaryotic expression vectors may further include transcription terminators, such as the aspA terminator, and elements that facilitate translation, such as a consensus ribosome binding site and translation termination codon, Schomer et al., Proc. Natl. Acad. Sci. USA 83: 8506-8510 (1986).

[0169] Expression control sequences for yeast cells, typically S. cerevisiae, will include a yeast promoter, such as the CYC1 promoter, the GAL1 promoter, the GAL10 promoter, the ADH1 promoter, the promoters of the yeast α-mating system, or the GPD promoter, and will typically have elements that facilitate transcription termination, such as the transcription termination signals from the CYC1 or ADH1 gene.

[0170] Expression vectors useful for expressing proteins in mammalian cells will include a promoter active in mammalian cells. These promoters include those derived from mammalian viruses, such as the enhancer-promoter sequences from the immediate early gene of the human cytomegalovirus (CMV), the enhancer-promoter sequences from the Rous sarcoma virus long terminal repeat (RSV LTR), the enhancer-promoter from SV40 or the early and late promoters of adenovirus. Other expression control sequences include the promoter for 3-phosphoglycerate kinase or other glycolytic enzymes, and the promoters of acid phosphatase. Other expression control sequences include those from the gene comprising the BSNA of interest. Often, expression is enhanced by incorporation of polyadenylation sites, such as the late SV40 polyadenylation site and the polyadenylation signal and transcription termination sequences from the bovine growth hormone (BGH) gene, and ribosome binding sites. Furthermore, vectors can include introns, such as intron II of the rabbit β-globin gene and the SV40 splice elements.

[0171] Preferred nucleic acid vectors also include a selectable or amplifiable marker gene and means for amplifying the copy number of the gene of interest. Such marker genes are well-known in the art. Nucleic acid vectors may also comprise stabilizing sequences (e.g., ori- or ARS-like sequences and telomere-like sequences), or may alternatively be designed to favor directed or non-directed integration into the host cell genome. In a preferred embodiment, nucleic acid sequences of this invention are inserted in frame into an expression vector that allows high level expression of an RNA which encodes a protein comprising the encoded nucleic acid sequence of interest. Nucleic acid cloning and sequencing methods are well-known to those of skill in the art and are described in an assortment of laboratory manuals, including Sambrook (1989), supra; Sambrook (2000), supra; Ausubel (1992), supra; and Ausubel (1999), supra. Product information from manufacturers of biological, chemical and immunological reagents also provides useful information.

[0172] Expression vectors may be either constitutive or inducible. Inducible vectors either include naturally inducible promoters, such as the trc promoter, which is regulated by the lac operon, the pL promoter, which is regulated by tryptophan, and the MMTV-LTR promoter, which is inducible by dexamethasone, or contain synthetic promoters and/or additional elements that confer inducible control on adjacent promoters. Examples of inducible synthetic promoters are the hybrid Plac/ara-1 promoter and the PLtetO-1 promoter. The PLtetO-1 promoter takes advantage of the high expression levels from the PL promoter of phage lambda, but replaces the lambda repressor sites with two copies of operator 2 of the Tn10 tetracycline resistance operon, causing this promoter to be tightly repressed by the Tet repressor protein and induced in response to tetracycline (Tc) and Tc derivatives such as anhydrotetracycline. Vectors may also be inducible because they contain hormone response elements, such as the glucocorticoid response element (GRE) and the estrogen response element (ERE), which can confer hormone inducibility where vectors are used for expression in cells having the respective hormone receptors. To reduce background levels of expression, elements responsive to ecdysone, an insect hormone, can be used instead, with coexpression of the ecdysone receptor.

[0173] In one aspect of the invention, expression vectors can be designed to fuse the expressed polypeptide to small protein tags that facilitate purification and/or visualization. Tags that facilitate purification include a polyhistidine tag that facilitates purification of the fusion protein by immobilized metal affinity chromatography, for example using NiNTA resin (Qiagen Inc., Valencia, Calif., USA) or TALON™ resin (cobalt immobilized affinity chromatography medium, Clontech Labs, Palo Alto, Calif., USA). The fusion protein can include a chitin-binding tag and self-excising intein, permitting chitin-based purification with self-removal of the fused tag (IMPACT™ system, New England Biolabs, Inc., Beverly, Mass., USA). Alternatively, the fusion protein can include a calmodulin-binding peptide tag, permitting purification by calmodulin affinity resin (Stratagene, La Jolla, Calif., USA), or a specifically excisable fragment of the biotin carboxylase carrier protein, permitting purification of in vivo biotinylated protein using an avidin resin and subsequent tag removal (Promega, Madison, Wis., USA). As another useful alternative, the proteins of the present invention can be expressed as a fusion protein with glutathione-S-transferase, the affinity and specificity of binding to glutathione permitting purification using glutathione affinity resins, such as Glutathione-Superflow Resin (Clontech Laboratories, Palo Alto, Calif., USA), with subsequent elution with free glutathione. Other tags include, for example, the Xpress epitope, detectable by anti-Xpress antibody (Invitrogen, Carlsbad, Calif., USA), a myc tag, detectable by anti-myc tag antibody, the V5 epitope, detectable by anti-V5 antibody (Invitrogen, Carlsbad, Calif., USA), the FLAG® epitope, detectable by anti-FLAG® antibody (Stratagene, La Jolla, Calif., USA), and the HA epitope.

[0174] For secretion of expressed proteins, vectors can include appropriate sequences that encode secretion signals, such as leader peptides. For example, the pSecTag2 vectors (Invitrogen, Carlsbad, Calif., USA) are 5.2 kb mammalian expression vectors that carry the secretion signal from the V-J2-C region of the mouse Ig kappa-chain for efficient secretion of recombinant proteins from a variety of mammalian cell lines.

[0175] Expression vectors can also be designed to fuse proteins encoded by the heterologous nucleic acid insert to polypeptides that are larger than purification and/or identification tags. Useful fusion proteins include those that permit display of the encoded protein on the surface of a phage or cell, fusion to intrinsically fluorescent proteins, such as those that have a green fluorescent protein (GFP)-like chromophore, fusions to the IgG Fc region, and fusion proteins for use in two hybrid systems.

[0176] Vectors for phage display fuse the encoded polypeptide to, e.g., the gene III protein (pIII) or gene VIII protein (pVIII) for display on the surface of filamentous phage, such as M13. See Barbas et al., Phage Display: A Laboratory Manual, Cold Spring Harbor Laboratory Press (2001); Kay et al. (eds.), Phage Display of Peptides and Proteins: A Laboratory Manual, Academic Press, Inc., (1996); Abelson et al. (eds.), Combinatorial Chemistry (Methods in Enzymology, Vol. 267) Academic Press (1996). Vectors for yeast display, e.g., the pYD1 yeast display vector (Invitrogen, Carlsbad, Calif., USA), use the -agglutinin yeast adhesion receptor to display recombinant protein on the surface of S. cerevisiae. Vectors for mammalian display, e.g., the pDisplay™ vector (Invitrogen, Carlsbad, Calif., USA), target recombinant proteins using an N-terminal cell surface targeting signal and a C-terminal transmembrane anchoring domain of platelet derived growth factor receptor.

[0177] A wide variety of vectors now exist that fuse proteins encoded by heterologous nucleic acids to the chromophore of the substrate-independent, intrinsically fluorescent green fluorescent protein from Aequorea victoria (“GFP”) and its variants. The GFP-like chromophore can be selected from GFP-like chromophores found in naturally occurring proteins, such as A. victoria GFP (GenBank accession number AAA27721), Renilla reniformis GFP, FP583 (GenBank accession no. AF168419) (DsRed), FP593 (AF272711), FP483 (AF168420), FP484 (AF168424), FP595 (AF246709), FP486 (AF168421), FP538 (AF168423), and FP506 (AF168422), and need include only so much of the native protein as is needed to retain the chromophore's intrinsic fluorescence. Methods for determining the minimal domain required for fluorescence are known in the art. See Li et al., J. Biol. Chem. 272: 28545-28549 (1997). Alternatively, the GFP-like chromophore can be selected from GFP-like chromophores modified from those found in nature. The methods for engineering such modified GFP-like chromophores and testing them for fluorescence activity, both alone and as part of protein fusions, are well-known in the art. See Heim et al., Curr. Biol. 6: 178-182 (1996) and Palm et al., Methods Enzymol. 302: 378-394 (1999), each incorporated herein by reference in its entirety. A variety of such modified chromophores are now commercially available and can readily be used in the fusion proteins of the present invention. These include EGFP (“enhanced GFP”), EBFP (“enhanced blue fluorescent protein”), BFP2, EYFP (“enhanced yellow fluorescent protein”), ECFP (“enhanced cyan fluorescent protein”) or Citrine. EGFP (see, e.g., Cormack et al., Gene 173: 33-38 (1996); U.S. Pat. Nos. 6,090,919 and 5,804,387) is found on a variety of vectors, both plasmid and viral, which are available commercially (Clontech Labs, Palo Alto, Calif., USA); EBFP is optimized for expression in mammalian cells whereas BFP2, which retains the original jellyfish codons, can be expressed in bacteria (see, e.g., Heim et al., Curr. Biol. 6: 178-182 (1996) and Cormack et al., Gene 173: 33-38 (1996)). Vectors containing these blue-shifted variants are available from Clontech Labs (Palo Alto, Calif., USA). Vectors containing EYFP, ECFP (see, e.g., Heim et al., Curr. Biol. 6: 178-182 (1996); Miyawaki et al., Nature 388: 882-887 (1997)) and Citrine (see, e.g., Heikal et al., Proc. Natl. Acad. Sci. USA 97: 11996-12001 (2000)) are also available from Clontech Labs. The GFP-like chromophore can also be drawn from other modified GFPs, including those described in U.S. Pat. Nos. 6,124,128; 6,096,865; 6,090,919; 6,066,476; 6,054,321; 6,027,881; 5,968,750; 5,874,304; 5,804,387; 5,777,079; 5,741,668; and 5,625,048, the disclosures of which are incorporated herein by reference in their entireties. See also Conn (ed.), Green Fluorescent Protein (Methods in Enzymology, Vol. 302), Academic Press, Inc. (1999). The GFP-like chromophore of each of these GFP variants can usefully be included in the fusion proteins of the present invention.

[0178] Fusions to the IgG Fc region increase serum half-life of protein pharmaceutical products through interaction with the FcRn receptor (also denominated the FcRp receptor and the Brambell receptor, FcRb), further described in International Patent Application Nos. WO 97/43316, WO 97/34631, WO 96/32478, and WO 96/18412.

[0179] For long-term, high-yield recombinant production of the proteins, protein fusions, and protein fragments of the present invention, stable expression is preferred. Stable expression is readily achieved by integration into the host cell genome of vectors having selectable markers, followed by selection of these integrants. Vectors such as pUB6/V5-His A, B, and C (Invitrogen, Carlsbad, Calif., USA) are designed for high-level stable expression of heterologous proteins in a wide range of mammalian tissue types and cell lines. pUB6/V5-His uses the promoter/enhancer sequence from the human ubiquitin C gene to drive expression of recombinant proteins: expression levels in 293, CHO, and NIH3T3 cells are comparable to levels from the CMV and human EF-1α promoters. The bsd gene permits rapid selection of stably transfected mammalian cells with the potent antibiotic blasticidin.

[0180] Replication incompetent retroviral vectors, typically derived from Moloney murine leukemia virus, also are useful for creating stable transfectants having integrated provirus. The highly efficient transduction machinery of retroviruses, coupled with the availability of a variety of packaging cell lines such as RetroPack™ PT 67, EcoPack2™-293, AmphoPack-293, and GP2-293 cell lines (all available from Clontech Laboratories, Palo Alto, Calif., USA), allows a wide host range to be infected with high efficiency; varying the multiplicity of infection readily adjusts the copy number of the integrated provirus.

[0181] Of course, not all vectors and expression control sequences will function equally well to express the nucleic acid sequences of this invention. Neither will all hosts function equally well with the same expression system. However, one of skill in the art may make a selection among these vectors, expression control sequences and hosts without undue experimentation and without departing from the scope of this invention. For example, in selecting a vector, the host must be considered because the vector must be replicated in it. The vector's copy number, the ability to control that copy number, the ability to control integration, if any, and the expression of any other proteins encoded by the vector, such as antibiotic or other selection markers, should also be considered. The present invention further includes host cells comprising the vectors of the present invention, either present episomally within the cell or integrated, in whole or in part, into the host cell chromosome. Among other considerations, some of which are described above, a host cell strain may be chosen for its ability to process the expressed protein in the desired fashion. Such post-translational modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation, and it is an aspect of the present invention to provide BSPs with such post-translational modifications.

[0182] Polypeptides of the invention may be post-translationally modified. Post-translational modifications include phosphorylation of amino acid residues serine, threonine and/or tyrosine, N-linked and/or O-linked glycosylation, methylation, acetylation, prenylation, arginylation, ubiquitination and racemization. One may determine whether a polypeptide of the invention is likely to be post-translationally modified by analyzing the sequence of the polypeptide to determine if there are peptide motifs indicative of sites for post-translational modification. There are a number of computer programs that permit prediction of post-translational modifications. See, e.g., www.expasy.org (accessed Aug. 31, 2001), which includes PSORT, for prediction of protein sorting signals and localization sites, SignalP, for prediction of signal peptide cleavage sites, MITOPROT and Predotar, for prediction of mitochondrial targeting sequences, NetOGlyc, for prediction of O-glycosylation sites in mammalian proteins, big-PI Predictor and DGPI, for prediction of GPI-anchor and cleavage sites, and NetPhos, for prediction of Ser, Thr and Tyr phosphorylation sites in eukaryotic proteins. Other computer programs, such as those included in GCG, also may be used to determine post-translational modification peptide motifs.

[0183] General examples of types of post-translational modifications may be found in web sites such as the Delta Mass database http://www.abrf.org/ABRF/Research Committees/deltamass/deltamass.html (accessed Oct. 19, 2001); “GlycoSuiteDB: a new curated relational database of glycoprotein glycan structures and their biological sources” Cooper et al. Nucleic Acids Res. 29: 332-335 (2001) and http://www.glycosuite.com/ (accessed Oct. 19, 2001); “O-GLYCBASE version 4.0: a revised database of O-glycosylated proteins” Gupta et al. Nucleic Acids Research, 27: 370-372 (1999) and http://www.cbs.dtu.dk/databases/OGLYCBASE/ (accessed Oct. 19, 2001); “PhosphoBase, a database of phosphorylation sites: release 2.0.”, Kreegipuu et al. Nucleic Acids Res 27(1):237-239 (1999) and http://www.cbs.dtu.dk/databases/PhosphoBase/ (accessed Oct. 19, 2001); or http://pir.georgetown.edu/pirwww/search/textresid.html (accessed Oct. 19, 2001).

[0184] Tumorigenesis is often accompanied by alterations in the post-translational modifications of proteins. Thus, in another embodiment, the invention provides polypeptides from cancerous cells or tissues that have altered post-translational modifications compared to the post-translational modifications of polypeptides from normal cells or tissues. A number of altered post-translational modifications are known. One common alteration is a change in phosphorylation state, wherein the polypeptide from the cancerous cell or tissue is hyperphosphorylated or hypophosphorylated compared to the polypeptide from a normal tissue, or wherein the polypeptide is phosphorylated on different residues than the polypeptide from a normal cell. Another common alteration is a change in glycosylation state, wherein the polypeptide from the cancerous cell or tissue has more or less glycosylation than the polypeptide from a normal tissue, and/or wherein the polypeptide from the cancerous cell or tissue has a different type of glycosylation than the polypeptide from a noncancerous cell or tissue. Changes in glycosylation may be critical because carbohydrate-protein and carbohydrate-carbohydrate interactions are important in cancer cell progression, dissemination and invasion. See, e.g., Barchi, Curr. Pharm. Des. 6: 485-501 (2000), Verma, Cancer Biochem. Biophys. 14: 151-162 (1994) and Dennis et al., Bioessays 5: 412-421 (1999).

[0185] Another post-translational modification that may be altered in cancer cells is prenylation. Prenylation is the covalent attachment of a hydrophobic prenyl group (either farnesyl or geranylgeranyl) to a polypeptide. Prenylation is required for localizing a protein to a cell membrane and is often required for polypeptide function. For instance, the Ras superfamily of GTPase signaling proteins must be prenylated for function in a cell. See, e.g., Prendergast et al., Semin. Cancer Biol. 10: 443-452 (2000) and Khwaja et al., Lancet 355: 741-744 (2000).

[0186] Other post-translational modifications that may be altered in cancer cells include, without limitation, polypeptide methylation, acetylation, arginylation or racemization of amino acid residues. In these cases, the polypeptide from the cancerous cell may exhibit either increased or decreased amounts of the post-translational modification compared to the corresponding polypeptides from noncancerous cells.

[0187] Other polypeptide alterations in cancer cells include abnormal cleavage of polypeptides and aberrant protein-protein interactions. Abnormal polypeptide cleavage may be cleavage of a polypeptide in a cancerous cell that does not usually occur in a normal cell, or a lack of cleavage in a cancerous cell, wherein the polypeptide is cleaved in a normal cell. Aberrant protein-protein interactions may be either covalent cross-linking or non-covalent binding between proteins that do not normally bind to each other. Alternatively, in a cancerous cell, a protein may fail to bind to another protein to which it is bound in a noncancerous cell. Alterations in cleavage or in protein-protein interactions may be due to over- or underproduction of a polypeptide in a cancerous cell compared to that in a normal cell, or may be due to alterations in post-translational modifications (see above) of one or more proteins in the cancerous cell. See, e.g., Henschen-Edman, Ann. N.Y. Acad. Sci. 936: 580-593 (2001).

[0188] Alterations in polypeptide post-translational modifications, as well as changes in polypeptide cleavage and protein-protein interactions, may be determined by any method known in the art. For instance, alterations in phosphorylation may be determined by using anti-phosphoserine, anti-phosphothreonine or anti-phosphotyrosine antibodies or by amino acid analysis. Glycosylation alterations may be determined using antibodies specific for different sugar residues, by carbohydrate sequencing, or by alterations in the size of the glycoprotein, which can be determined by, e.g., SDS polyacrylamide gel electrophoresis (PAGE). Other alterations of post-translational modifications, such as prenylation, racemization, methylation, acetylation and arginylation, may be determined by chemical analysis, protein sequencing, amino acid analysis, or by using antibodies specific for the particular post-translational modifications. Changes in protein-protein interactions and in polypeptide cleavage may be analyzed by any method known in the art including, without limitation, non-denaturing PAGE (for non-covalent protein-protein interactions), SDS PAGE (for covalent protein-protein interactions and protein cleavage), chemical cleavage, protein sequencing or immunoassays.

[0189] In another embodiment, the invention provides polypeptides that have been post-translationally modified. In one embodiment, polypeptides may be modified enzymatically or chemically, by addition or removal of a post-translational modification. For example, a polypeptide may be glycosylated or deglycosylated enzymatically. Similarly, polypeptides may be phosphorylated using a purified kinase, such as a MAP kinase (e.g., p38, ERK, or JNK) or a tyrosine kinase (e.g., Src or erbB2). A polypeptide may also be modified through synthetic chemistry. Alternatively, one may isolate the polypeptide of interest from a cell or tissue that expresses the polypeptide with the desired post-translational modification. In another embodiment, a nucleic acid molecule encoding the polypeptide of interest is introduced into a host cell that is capable of post-translationally modifying the encoded polypeptide in the desired fashion. If the polypeptide does not contain a motif for a desired post-translational modification, one may alter the post-translational modification by mutating the nucleic acid sequence of a nucleic acid molecule encoding the polypeptide so that it contains a site for the desired post-translational modification. Amino acid sequences that may be post-translationally modified are known in the art. See, e.g., the programs described above on the website www.expasy.org. The nucleic acid molecule is then introduced into a host cell that is capable of post-translationally modifying the encoded polypeptide. Similarly, one may delete sites that are post-translationally modified by either mutating the nucleic acid sequence so that the encoded polypeptide does not contain the post-translational modification motif, or by introducing the native nucleic acid molecule into a host cell that is not capable of post-translationally modifying the encoded polypeptide.

[0190] In selecting an expression control sequence, a variety of factors should also be considered. These include, for example, the relative strength of the sequence, its controllability, and its compatibility with the nucleic acid sequence of this invention, particularly with regard to potential secondary structures. Unicellular hosts should be selected by consideration of their compatibility with the chosen vector, the toxicity of the product coded for by the nucleic acid sequences of this invention, their secretion characteristics, their ability to fold the polypeptide correctly, their fermentation or culture requirements, and the ease of purification from them of the products coded for by the nucleic acid sequences of this invention.

[0191] The recombinant nucleic acid molecules and, more particularly, the expression vectors of this invention may be used to express the polypeptides of this invention as recombinant polypeptides in a heterologous host cell. The polypeptides of this invention may be full-length or less than full-length polypeptide fragments recombinantly expressed from the nucleic acid sequences according to this invention. Such polypeptides include analogs, derivatives and muteins that may or may not have biological activity.

[0192] Vectors of the present invention will also often include elements that permit in vitro transcription of RNA from the inserted heterologous nucleic acid. Such vectors typically include a phage promoter, such as that from T7, T3, or SP6, flanking the nucleic acid insert. Often two different such promoters flank the inserted nucleic acid, permitting separate in vitro production of both sense and antisense strands.

[0193] Transformation and other methods of introducing nucleic acids into a host cell (e.g., conjugation, protoplast transformation or fusion, transfection, electroporation, liposome delivery, membrane fusion techniques, high velocity DNA-coated pellets and viral infection) can be accomplished by a variety of methods which are well-known in the art (see, for instance, Ausubel, supra, and Sambrook et al., supra). Bacterial, yeast, plant or mammalian cells are transformed or transfected with an expression vector, such as a plasmid, a cosmid, or the like, wherein the expression vector comprises the nucleic acid of interest. Alternatively, the cells may be infected by a viral expression vector comprising the nucleic acid of interest. Depending upon the host cell, vector, and method of transformation used, transient or stable expression of the polypeptide will be constitutive or inducible. One having ordinary skill in the art will be able to decide whether to express a polypeptide transiently or stably, and whether to express the protein constitutively or inducibly.

[0194] A wide variety of unicellular host cells are useful in expressing the DNA sequences of this invention. These hosts may include well-known eukaryotic and prokaryotic hosts, such as strains of fungi, yeast, insect cells such as Spodoptera frugiperda (SF9), animal cells such as CHO, as well as plant cells in tissue culture. Representative examples of appropriate host cells include, but are not limited to, bacterial cells, such as E. coli, Caulobacter crescentus, Streptomyces species, and Salmonella typhimurium; yeast cells, such as Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, Pichia methanolica; insect cell lines, such as those from Spodoptera frugiperda, e.g., Sf9 and Sf21 cell lines, and expresSF™ cells (Protein Sciences Corp., Meriden, Conn., USA), Drosophila S2 cells, and Trichoplusia ni High Five® Cells (Invitrogen, Carlsbad, Calif., USA); and mammalian cells. Typical mammalian cells include BHK cells, BSC 1 cells, BSC 40 cells, BMT 10 cells, VERO cells, COS1 cells, COS7 cells, Chinese hamster ovary (CHO) cells, 3T3 cells, NIH 3T3 cells, 293 cells, HEPG2 cells, HeLa cells, L cells, MDCK cells, HEK293 cells, WI38 cells, murine ES cell lines (e.g., from strains 129/SV, C57/BL6, DBA-1, 129/SVJ), K562 cells, Jurkat cells, and BW5147 cells. Other mammalian cell lines are well-known and readily available from the American Type Culture Collection (ATCC) (Manassas, Va., USA) and the National Institute of General Medical Sciences (NIGMS) Human Genetic Cell Repository at the Coriell Cell Repositories (Camden, N.J., USA). Cells or cell lines derived from breast are particularly preferred because they may provide a more native post-translational processing. Particularly preferred are human breast cells.

[0195] Particular details of the transfection, expression and purification of recombinant proteins are well documented and are understood by those of skill in the art. Further details on the various technical aspects of each of the steps used in recombinant production of foreign genes in bacterial cell expression systems can be found in a number of texts and laboratory manuals in the art. See, e.g., Ausubel (1992), supra, Ausubel (1999), supra, Sambrook (1989), supra, and Sambrook (2001), supra, herein incorporated by reference.

[0196] Methods for introducing the vectors and nucleic acids of the present invention into the host cells are well-known in the art; the choice of technique will depend primarily upon the specific vector to be introduced and the host cell chosen.

[0197] Nucleic acid molecules and vectors may be introduced into prokaryotes, such as E. coli, in a number of ways. For instance, phage lambda vectors will typically be packaged using a packaging extract (e.g., Gigapack® packaging extract, Stratagene, La Jolla, Calif., USA), and the packaged virus used to infect E. coli.

[0198] Plasmid vectors will typically be introduced into chemically competent or electrocompetent bacterial cells. E. coli cells can be rendered chemically competent by treatment, e.g., with CaCl₂, or a solution of Mg²⁺, Mn²⁺, Ca²⁺, Rb⁺ or K⁺, dimethyl sulfoxide, dithiothreitol, and hexamine cobalt (III), Hanahan, J. Mol. Biol. 166(4):557-80 (1983), and vectors introduced by heat shock. A wide variety of chemically competent strains are also available commercially (e.g., Epicurian Coli® XL10-Gold® Ultracompetent Cells (Stratagene, La Jolla, Calif., USA); DH5 competent cells (Clontech Laboratories, Palo Alto, Calif., USA); and TOP10 Chemically Competent E. coli Kit (Invitrogen, Carlsbad, Calif., USA)). Bacterial cells can be rendered electrocompetent, that is, competent to take up exogenous DNA by electroporation, by various pre-pulse treatments; vectors are introduced by electroporation followed by outgrowth in selective media. An extensive series of protocols is provided online in Electroprotocols (BioRad, Richmond, Calif., USA) (http://www.biorad.com/LifeScience/pdf,New_Gene_Pulser.pdf).

[0199] Vectors can be introduced into yeast cells by spheroplasting, treatment with lithium salts, electroporation, or protoplast fusion. Spheroplasts are prepared by the action of hydrolytic enzymes such as snail-gut extract, usually denoted Glusulase, or Zymolyase, an enzyme from Arthrobacter luteus, to remove portions of the cell wall in the presence of osmotic stabilizers, typically 1 M sorbitol. DNA is added to the spheroplasts, and the mixture is co-precipitated with a solution of polyethylene glycol (PEG) and Ca²⁺. Subsequently, the cells are resuspended in a solution of sorbitol, mixed with molten agar and then layered on the surface of a selective plate containing sorbitol.

[0200] For lithium-mediated transformation, yeast cells are treated with lithium acetate, which apparently permeabilizes the cell wall, DNA is added and the cells are co-precipitated with PEG. The cells are exposed to a brief heat shock, washed free of PEG and lithium acetate, and subsequently spread on plates containing ordinary selective medium. Increased frequencies of transformation are obtained by using specially-prepared single-stranded carrier DNA and certain organic solvents. Schiestl et al., Curr. Genet. 16(5-6): 339-46 (1989).

[0201] For electroporation, freshly-grown yeast cultures are typically washed, suspended in an osmotic protectant, such as sorbitol, mixed with DNA, and the cell suspension pulsed in an electroporation device. Subsequently, the cells are spread on the surface of plates containing selective media. Becker et al., Methods Enzymol. 194: 182-187 (1991). The efficiency of transformation by electroporation can be increased over 100-fold by using PEG, single-stranded carrier DNA and cells that are in late log-phase of growth. Larger constructs, such as YACs, can be introduced by protoplast fusion.

[0202] Mammalian and insect cells can be directly infected by packaged viral vectors, or transfected by chemical or electrical means. For chemical transfection, DNA can be coprecipitated with CaPO₄ or introduced using liposomal and nonliposomal lipid-based agents. Commercial kits are available for CaPO₄ transfection (CalPhos™ Mammalian Transfection Kit, Clontech Laboratories, Palo Alto, Calif., USA), and lipid-mediated transfection can be practiced using commercial reagents, such as LIPOFECTAMINE™ 2000, LIPOFECTAMINE™ Reagent, CELLFECTIN® Reagent, and LIPOFECTIN® Reagent (Invitrogen, Carlsbad, Calif., USA), DOTAP Liposomal Transfection Reagent, FuGENE 6, X-tremeGENE Q2, DOSPER (Roche Molecular Biochemicals, Indianapolis, Ind., USA), Effectene™, PolyFect®, and Superfect® (Qiagen, Inc., Valencia, Calif., USA). Protocols for electroporating mammalian cells can be found online in Electroprotocols (Bio-Rad, Richmond, Calif., USA) (http://www.bio-rad.com/LifeScience/pdf, New_Gene_Pulser.pdf); Norton et al. (eds.), Gene Transfer Methods: Introducing DNA into Living Cells and Organisms, BioTechniques Books, Eaton Publishing Co. (2000); incorporated herein by reference in its entirety. Other transfection techniques include transfection by particle bombardment and microinjection. See, e.g., Cheng et al., Proc. Natl. Acad. Sci. USA 90(10): 4455-9 (1993); Yang et al., Proc. Natl. Acad. Sci. USA 87(24): 9568-72 (1990).

[0203] Production of the recombinantly produced proteins of the present invention can optionally be followed by purification.

[0204] Purification of recombinantly expressed proteins is now well understood by those skilled in the art. See, e.g., Thorner et al. (eds.), Applications of Chimeric Genes and Hybrid Proteins, Part A: Gene Expression and Protein Purification (Methods in Enzymology, Vol. 326), Academic Press (2000); Harbin (ed.), Cloning Gene Expression and Protein Purification: Experimental Procedures and Process Rationale, Oxford Univ. Press (2001); Marshak et al., Strategies for Protein Purification and Characterization: A Laboratory Course Manual, Cold Spring Harbor Laboratory Press (1996); and Roe (ed.), Protein Purification Applications, Oxford University Press (2001); the disclosures of which are incorporated herein by reference in their entireties, and thus need not be detailed here.

[0205] Briefly, however, if purification tags have been fused through use of an expression vector that appends such tags, purification can be effected, at least in part, by means appropriate to the tag, such as use of immobilized metal affinity chromatography for polyhistidine tags. Other techniques common in the art include ammonium sulfate fractionation, immunoprecipitation, fast protein liquid chromatography (FPLC), high performance liquid chromatography (HPLC), and preparative gel electrophoresis.

[0206] Polypeptides

[0207] Another object of the invention is to provide polypeptides encoded by the nucleic acid molecules of the instant invention. In a preferred embodiment, the polypeptide is a breast specific polypeptide (BSP). In an even more preferred embodiment, the polypeptide is derived from a polypeptide comprising the amino acid sequence of SEQ ID NO: 116 through 218. A polypeptide as defined herein may be produced recombinantly, as discussed supra, may be isolated from a cell that naturally expresses the protein, or may be chemically synthesized following the teachings of the specification and using methods well-known to those having ordinary skill in the art.

[0208] In another aspect, the polypeptide may comprise a fragment of a polypeptide, wherein the fragment is as defined herein. In a preferred embodiment, the polypeptide fragment is a fragment of a BSP. In a more preferred embodiment, the fragment is derived from a polypeptide comprising the amino acid sequence of SEQ ID NO: 116 through 218. A polypeptide that comprises only a fragment of an entire BSP may or may not be a polypeptide that is also a BSP. For instance, a full-length polypeptide may be breast-specific, while a fragment thereof may be found in other tissues as well as in breast. A polypeptide that is not a BSP, whether it is a fragment, analog, mutein, homologous protein or derivative, is nevertheless useful, especially for immunizing animals to prepare anti-BSP antibodies. However, in a preferred embodiment, the part or fragment is a BSP. Methods of determining whether a polypeptide is a BSP are described infra.

[0209] Fragments of at least 6 contiguous amino acids are useful in mapping B cell and T cell epitopes of the reference protein. See, e.g., Geysen et al., Proc. Natl. Acad. Sci. USA 81: 3998-4002 (1984) and U.S. Pat. Nos. 4,708,871 and 5,595,915, the disclosures of which are incorporated herein by reference in their entireties. Because the fragment need not itself be immunogenic, part of an immunodominant epitope, nor even recognized by native antibody, to be useful in such epitope mapping, all fragments of at least 6 amino acids of the proteins of the present invention have utility in such a study.

[0210] Fragments of at least 8 contiguous amino acids, often at least 15 contiguous amino acids, are useful as immunogens for raising antibodies that recognize the proteins of the present invention. See, e.g., Lerner, Nature 299: 592-596 (1982); Shinnick et al., Annu. Rev. Microbiol. 37: 425-46 (1983); Sutcliffe et al., Science 219: 660-6 (1983), the disclosures of which are incorporated herein by reference in their entireties. As further described in the above-cited references, virtually all 8-mers, conjugated to a carrier, such as a protein, prove immunogenic, meaning that they are capable of eliciting antibody for the conjugated peptide; accordingly, all fragments of at least 8 amino acids of the proteins of the present invention have utility as immunogens.

[0211] Fragments of at least 8, 9, 10 or 12 contiguous amino acids are also useful as competitive inhibitors of binding of the entire protein, or a portion thereof, to antibodies (as in epitope mapping), and to natural binding partners, such as subunits in a multimeric complex or to receptors or ligands of the subject protein; this competitive inhibition permits identification and separation of molecules that bind specifically to the protein of interest. See U.S. Pat. Nos. 5,539,084 and 5,783,674, incorporated herein by reference in their entireties.

[0212] The protein, or protein fragment, of the present invention is thus at least 6 amino acids in length, typically at least 8, 9, 10 or 12 amino acids in length, and often at least 15 amino acids in length. Often, the protein of the present invention, or fragment thereof, is at least 20 amino acids in length, even 25 amino acids, 30 amino acids, 35 amino acids, or 50 amino acids or more in length. Of course, larger fragments having at least 75 amino acids, 100 amino acids, or even 150 amino acids are also useful, and at times preferred.
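
By way of illustration only, the set of contiguous fragments of a given length can be enumerated from a polypeptide sequence with a simple sliding window, as in the sketch below. The function name and the example sequence are hypothetical and are not drawn from the sequence listing; the minimum length of 6 used in the example corresponds to the epitope-mapping fragments discussed above.

```python
def contiguous_fragments(seq, length):
    """Return every contiguous fragment of exactly `length` residues, in order.

    `seq` is a polypeptide sequence in one-letter amino acid code.
    A sequence shorter than `length` yields an empty list.
    This is a hypothetical helper written for illustration only.
    """
    if length < 1:
        raise ValueError("fragment length must be at least 1")
    # Slide a window of the requested length across the sequence.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


if __name__ == "__main__":
    # Hypothetical 8-residue peptide: three overlapping 6-mers are produced.
    for fragment in contiguous_fragments("MKTAYIAK", 6):
        print(fragment)
```

A sequence of n residues yields n - k + 1 fragments of length k, so the number of candidate fragments grows only linearly with the length of the protein for any fixed fragment size.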

[0213] One having ordinary skill in the art can produce fragments of a polypeptide by truncating the nucleic acid molecule, e.g., a BSNA, encoding the polypeptide and then expressing it recombinantly. Alternatively, one can produce a fragment by chemically synthesizing a portion of the full-length polypeptide. One may also produce a fragment by enzymatically cleaving either a recombinant polypeptide or an isolated naturally-occurring polypeptide. Methods of producing polypeptide fragments are well-known in the art. See, e.g., Sambrook (1989), supra; Sambrook (2001), supra; Ausubel (1992), supra; and Ausubel (1999), supra. In one embodiment, a polypeptide comprising only a fragment of a polypeptide of the invention, preferably a BSP, may be produced by chemical or enzymatic cleavage of a polypeptide. In a preferred embodiment, a polypeptide fragment is produced by expressing a nucleic acid molecule encoding a fragment of the polypeptide, preferably a BSP, in a host cell.

[0214] By “polypeptides” as used herein it is also meant to be inclusive of mutants, fusion proteins, homologous proteins and allelic variants of the polypeptides specifically exemplified.

[0215] A mutant protein, or mutein, may have the same or different properties compared to a naturally-occurring polypeptide and comprises at least one amino acid insertion, duplication, deletion, rearrangement or substitution compared to the amino acid sequence of a native protein. Small deletions and insertions can often be found that do not alter the function of the protein. In one embodiment, the mutein may or may not be breast-specific. In a preferred embodiment, the mutein is breast-specific. In a preferred embodiment, the mutein is a polypeptide that comprises at least one amino acid insertion, duplication, deletion, rearrangement or substitution compared to the amino acid sequence of SEQ ID NO: 116 through 218. In a more preferred embodiment, the mutein is one that exhibits at least 50% sequence identity, more preferably at least 60% sequence identity, even more preferably at least 70%, yet more preferably at least 80% sequence identity to a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218. In yet a more preferred embodiment, the mutein exhibits at least 85%, more preferably 90%, even more preferably 95% or 96%, and yet more preferably at least 97%, 98%, 99% or 99.5% sequence identity to a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218.
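
By way of illustration only, a percent sequence identity of the kind recited above can be computed from two sequences that have already been aligned, as in the sketch below. This is a simplified calculation under stated assumptions and is not the definition of sequence identity used in this specification: the function name and example sequences are hypothetical, the alignment itself is assumed to have been produced by a separate alignment program, and the denominator chosen here (every alignment column in which at least one sequence has a residue) is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns in which both sequences carry a gap are ignored; every other
    column counts toward the denominator (an assumed convention).
    This is a hypothetical helper written for illustration only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a column that is empty in both sequences carries no information
        columns += 1
        if a == b:
            matches += 1  # identical residues (a gap cannot match a residue here)
    if columns == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Hypothetical 10-residue sequences differing at a single position.
    print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
```

Because a gapped column counts against identity under this convention, a mutein carrying a single-residue insertion relative to a 5-residue reference scores 5 matches over 6 columns, or roughly 83%.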

[0216] A mutein may be produced by isolation from a naturally-occurring mutant cell, tissue or organism. A mutein may be produced by isolation from a cell, tissue or organism that has been experimentally mutagenized. Alternatively, a mutein may be produced by chemical manipulation of a polypeptide, such as by altering the amino acid residue to another amino acid residue using synthetic or semi-synthetic chemical techniques. In a preferred embodiment, a mutein may be produced from a host cell comprising an altered nucleic acid molecule compared to the naturally-occurring nucleic acid molecule. For instance, one may produce a mutein of a polypeptide by introducing one or more mutations into a nucleic acid sequence of the invention and then expressing it recombinantly. These mutations may be targeted, in which particular encoded amino acids are altered, or may be untargeted, in which random encoded amino acids within the polypeptide are altered. Muteins with random amino acid alterations can be screened for a particular biological activity or property, particularly whether the polypeptide is breast-specific, as described below. Multiple random mutations can be introduced into the gene by methods well-known to the art, e.g., by error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis and site-specific mutagenesis. Methods of producing muteins with targeted or random amino acid alterations are well-known in the art. See, e.g., Sambrook (1989), supra; Sambrook (2001), supra; Ausubel (1992), supra; Ausubel (1999), supra; U.S. Pat. No. 5,223,408; and the references discussed supra, each herein incorporated by reference.

[0217] By “polypeptide” as used herein it is also meant to be inclusive of polypeptides homologous to those polypeptides exemplified herein. In a preferred embodiment, the polypeptide is homologous to a BSP. In an even more preferred embodiment, the polypeptide is homologous to a BSP selected from the group having an amino acid sequence of SEQ ID NO: 116 through 218. In a preferred embodiment, the homologous polypeptide is one that exhibits significant sequence identity to a BSP. In a more preferred embodiment, the polypeptide is one that exhibits significant sequence identity to a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218. In an even more preferred embodiment, the homologous polypeptide is one that exhibits at least 50% sequence identity, more preferably at least 60% sequence identity, even more preferably at least 70%, yet more preferably at least 80% sequence identity to a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218. In a yet more preferred embodiment, the homologous polypeptide is one that exhibits at least 85%, more preferably 90%, even more preferably 95% or 96%, and yet more preferably at least 97% or 98% sequence identity to a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218. In another preferred embodiment, the homologous polypeptide is one that exhibits at least 99%, more preferably 99.5%, even more preferably 99.6%, 99.7%, 99.8% or 99.9% sequence identity to a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218. In a preferred embodiment, the amino acid substitutions are conservative amino acid substitutions as discussed above.

[0218] In another embodiment, the homologous polypeptide is one that is encoded by a nucleic acid molecule that selectively hybridizes to a BSNA. In a preferred embodiment, the homologous polypeptide is encoded by a nucleic acid molecule that hybridizes to a BSNA under low stringency, moderate stringency or high stringency conditions, as defined herein. In a more preferred embodiment, the BSNA is selected from the group consisting of SEQ ID NO: 1 through 115. In another preferred embodiment, the homologous polypeptide is encoded by a nucleic acid molecule that hybridizes to a nucleic acid molecule that encodes a BSP under low stringency, moderate stringency or high stringency conditions, as defined herein. In a more preferred embodiment, the BSP is selected from the group consisting of SEQ ID NO: 116 through 218.

[0219] The homologous polypeptide may be a naturally-occurring one that is derived from another species, especially one derived from another primate, such as chimpanzee, gorilla, rhesus macaque or baboon, wherein the homologous polypeptide comprises an amino acid sequence that exhibits significant sequence identity to that of SEQ ID NO: 116 through 218. The homologous polypeptide may also be a naturally-occurring polypeptide from a human, when the BSP is a member of a family of polypeptides. The homologous polypeptide may also be a naturally-occurring polypeptide derived from a non-primate, mammalian species, including without limitation, domesticated species, e.g., dog, cat, mouse, rat, rabbit, guinea pig, hamster, cow, horse, goat or pig. The homologous polypeptide may also be a naturally-occurring polypeptide derived from a non-mammalian species, such as birds or reptiles. The naturally-occurring homologous protein may be isolated directly from humans or other species. Alternatively, the nucleic acid molecule encoding the naturally-occurring homologous polypeptide may be isolated and used to express the homologous polypeptide recombinantly. In another embodiment, the homologous polypeptide may be one that is experimentally produced by random mutation of a nucleic acid molecule and subsequent expression of the nucleic acid molecule. In another embodiment, the homologous polypeptide may be one that is experimentally produced by directed mutation of one or more codons to alter the encoded amino acid of a BSP. Further, the homologous protein may or may not be a BSP. However, in a preferred embodiment, the homologous polypeptide is a BSP.

[0220] Relatedness of proteins can also be characterized using a second functional test, the ability of a first protein competitively to inhibit the binding of a second protein to an antibody. It is, therefore, another aspect of the present invention to provide isolated proteins not only identical in sequence to those described with particularity herein, but also to provide isolated proteins (“cross-reactive proteins”) that competitively inhibit the binding of antibodies to all or to a portion of various of the isolated polypeptides of the present invention. Such competitive inhibition can readily be determined using immunoassays well-known in the art.

[0221] As discussed above, single nucleotide polymorphisms (SNPs) occur frequently in eukaryotic genomes, and the sequence determined from one individual of a species may differ from other allelic forms present within the population. Thus, by “polypeptide” as used herein it is also meant to be inclusive of polypeptides encoded by an allelic variant of a nucleic acid molecule encoding a BSP. In a preferred embodiment, the polypeptide is encoded by an allelic variant of a gene that encodes a polypeptide having the amino acid sequence selected from the group consisting of SEQ ID NO: 116 through 218. In a yet more preferred embodiment, the polypeptide is encoded by an allelic variant of a gene that has the nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 through 115.

[0222] In another embodiment, the invention provides polypeptides which comprise derivatives of a polypeptide encoded by a nucleic acid molecule according to the instant invention. In a preferred embodiment, the polypeptide is a BSP. In a preferred embodiment, the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO: 116 through 218, or is a mutein, allelic variant, homologous protein or fragment thereof. In a preferred embodiment, the derivative has been acetylated, carboxylated, phosphorylated, glycosylated or ubiquitinated. In another preferred embodiment, the derivative has been labeled with, e.g., radioactive isotopes such as ¹²⁵I, ³²P, ³⁵S, and ³H. In another preferred embodiment, the derivative has been labeled with fluorophores, chemiluminescent agents, enzymes, and antiligands that can serve as specific binding pair members for a labeled ligand.

[0223] Polypeptide modifications are well-known to those of skill and have been described in great detail in the scientific literature. Several particularly common modifications, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic texts, such as, for instance, Creighton, Protein Structure and Molecular Properties, 2nd ed., W. H. Freeman and Company (1993). Many detailed reviews are available on this subject, such as, for example, those provided by Wold, in Johnson (ed.), Posttranslational Covalent Modification of Proteins, pgs. 1-12, Academic Press (1983); Seifter et al., Meth. Enzymol. 182: 626-646 (1990) and Rattan et al., Ann. N.Y. Acad. Sci. 663: 48-62 (1992).

[0224] It will be appreciated, as is well-known and as noted above, that polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of posttranslation events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translation natural processes and by entirely synthetic methods, as well. Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. In fact, blockage of the amino or carboxyl group in a polypeptide, or both, by a covalent modification, is common in naturally occurring and synthetic polypeptides and such modifications may be present in polypeptides of the present invention, as well. For instance, the amino terminal residue of polypeptides made in E. coli, prior to proteolytic processing, almost invariably will be N-formylmethionine.

[0225] Useful post-synthetic (and post-translational) modifications include conjugation to detectable labels, such as fluorophores. A wide variety of amine-reactive and thiol-reactive fluorophore derivatives have been synthesized that react under nondenaturing conditions with N-terminal amino groups and epsilon amino groups of lysine residues, on the one hand, and with free thiol groups of cysteine residues, on the other.

[0226] Kits are available commercially that permit conjugation of proteins to a variety of amine-reactive or thiol-reactive fluorophores: Molecular Probes, Inc. (Eugene, Oreg., USA), e.g., offers kits for conjugating proteins to Alexa Fluor 350, Alexa Fluor 430, Fluorescein-EX, Alexa Fluor 488, Oregon Green 488, Alexa Fluor 532, Alexa Fluor 546, Alexa Fluor 568, Alexa Fluor 594, and Texas Red-X.

[0227] A wide variety of other amine-reactive and thiol-reactive fluorophores are available commercially (Molecular Probes, Inc., Eugene, Oreg., USA), including Alexa Fluor 350, Alexa Fluor 488, Alexa Fluor 532, Alexa Fluor 546, Alexa Fluor 568, Alexa Fluor 594, Alexa Fluor 647 (monoclonal antibody labeling kits available from Molecular Probes, Inc., Eugene, Oreg., USA), BODIPY dyes, such as BODIPY 493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY TR, BODIPY 630/650, BODIPY 650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine, and Texas Red (available from Molecular Probes, Inc., Eugene, Oreg., USA).

[0228] The polypeptides of the present invention can also be conjugated to fluorophores, other proteins, and other macromolecules, using bifunctional linking reagents. Common homobifunctional reagents include, e.g., APG, AEDP, BASED, BMB, BMDB, BMH, BMOE, BM[PEO]3, BM[PEO]4, BS3, BSOCOES, DFDNB, DMA, DMP, DMS, DPDPB, DSG, DSP (Lomant's Reagent), DSS, DST, DTBP, DTME, DTSSP, EGS, HBVS, Sulfo-BSOCOES, Sulfo-DST, Sulfo-EGS (all available from Pierce, Rockford, Ill., USA); common heterobifunctional cross-linkers include ABH, AMAS, ANB-NOS, APDP, ASBA, BMPA, BMPH, BMPS, EDC, EMCA, EMCH, EMCS, KMUA, KMUH, GMBS, LC-SMCC, LC-SPDP, MBS, M2C2H, MPBH, MSA, NHS-ASA, PDPH, PMPI, SADP, SAED, SAND, SANPAH, SASD, SATP, SBAP, SFAD, SIA, SIAB, SMCC, SMPB, SMPH, SMPT, SPDP, Sulfo-EMCS, Sulfo-GMBS, Sulfo-HSAB, Sulfo-KMUS, Sulfo-LC-SPDP, Sulfo-MBS, Sulfo-NHS-LC-ASA, Sulfo-SADP, Sulfo-SANPAH, Sulfo-SIAB, Sulfo-SMCC, Sulfo-SMPB, Sulfo-LC-SMPT, SVSB, TFCS (all available from Pierce, Rockford, Ill., USA).

[0229] The polypeptides, fragments, and fusion proteins of the present invention can be conjugated, using such cross-linking reagents, to fluorophores that are not amine- or thiol-reactive. Other labels that usefully can be conjugated to the polypeptides, fragments, and fusion proteins of the present invention include radioactive labels, echosonographic contrast reagents, and MRI contrast agents.

[0230] The polypeptides, fragments, and fusion proteins of the present invention can also usefully be conjugated using cross-linking agents to carrier proteins, such as KLH, bovine thyroglobulin, and even bovine serum albumin (BSA), to increase immunogenicity for raising anti-BSP antibodies.

[0231] The polypeptides, fragments, and fusion proteins of the present invention can also usefully be conjugated to polyethylene glycol (PEG); PEGylation increases the serum half-life of proteins administered intravenously for replacement therapy. Delgado et al., Crit. Rev. Ther. Drug Carrier Syst. 9(3-4): 249-304 (1992); Scott et al., Curr. Pharm. Des. 4(6): 423-38 (1998); DeSantis et al., Curr. Opin. Biotechnol. 10(4): 324-30 (1999), incorporated herein by reference in their entireties. PEG monomers can be attached to the protein directly or through a linker, with PEGylation using PEG monomers activated with tresyl chloride (2,2,2-trifluoroethanesulphonyl chloride) permitting direct attachment under mild conditions.

[0232] In yet another embodiment, the invention provides analogs of a polypeptide encoded by a nucleic acid molecule according to the instant invention. In a preferred embodiment, the polypeptide is a BSP. In a more preferred embodiment, the analog is derived from a polypeptide having part or all of the amino acid sequence of SEQ ID NO: 116 through 218. In a preferred embodiment, the analog is one that comprises one or more substitutions of non-natural amino acids or non-native inter-residue bonds compared to the naturally-occurring polypeptide. In general, the non-peptide analog is structurally similar to a BSP, but one or more peptide linkages is replaced by a linkage selected from the group consisting of -CH₂NH-, -CH₂S-, -CH₂-CH₂-, -CH=CH- (cis and trans), -COCH₂-, -CH(OH)CH₂- and -CH₂SO-. In another embodiment, the non-peptide analog comprises substitution of one or more amino acids of a BSP with a D-amino acid of the same type or other non-natural amino acid in order to generate more stable peptides. D-amino acids can readily be incorporated during chemical peptide synthesis: peptides assembled from D-amino acids are more resistant to proteolytic attack; incorporation of D-amino acids can also be used to confer specific three-dimensional conformations on the peptide. Other amino acid analogues commonly added during chemical synthesis include ornithine, norleucine, phosphorylated amino acids (typically phosphoserine, phosphothreonine, phosphotyrosine), L-malonyltyrosine, a non-hydrolyzable analog of phosphotyrosine (see, e.g., Kole et al., Biochem. Biophys. Res. Com. 209: 817-821 (1995)), and various halogenated phenylalanine derivatives.

[0233] Non-natural amino acids can be incorporated during solid phase chemical synthesis or by recombinant techniques, although the former is typically more common. Solid phase chemical synthesis of peptides is well established in the art. Procedures are described, inter alia, in Chan et al. (eds.), Fmoc Solid Phase Peptide Synthesis: A Practical Approach (Practical Approach Series), Oxford Univ. Press (March 2000); Jones, Amino Acid and Peptide Synthesis (Oxford Chemistry Primers, No 7), Oxford Univ. Press (1992); and Bodanszky, Principles of Peptide Synthesis (Springer Laboratory), Springer Verlag (1993); the disclosures of which are incorporated herein by reference in their entireties.

[0234] Amino acid analogues having detectable labels are also usefully incorporated during synthesis to provide derivatives and analogs. Biotin, for example, can be added using biotinoyl-(9-fluorenylmethoxycarbonyl)-L-lysine (FMOC biocytin) (Molecular Probes, Eugene, Oreg., USA). Biotin can also be added enzymatically by incorporation into a fusion protein of an E. coli BirA substrate peptide. The FMOC and tBOC derivatives of dabcyl-L-lysine (Molecular Probes, Inc., Eugene, Oreg., USA) can be used to incorporate the dabcyl chromophore at selected sites in the peptide sequence during synthesis. The aminonaphthalene derivative EDANS, the most common fluorophore for pairing with the dabcyl quencher in fluorescence resonance energy transfer (FRET) systems, can be introduced during automated synthesis of peptides by using EDANS-FMOC-L-glutamic acid or the corresponding tBOC derivative (both from Molecular Probes, Inc., Eugene, Oreg., USA). Tetramethylrhodamine fluorophores can be incorporated during automated FMOC synthesis of peptides using (FMOC)-TMR-L-lysine (Molecular Probes, Inc., Eugene, Oreg., USA).

[0235] Other useful amino acid analogues that can be incorporated during chemical synthesis include aspartic acid, glutamic acid, lysine, and tyrosine analogues having allyl side-chain protection (Applied Biosystems, Inc., Foster City, Calif., USA); the allyl side chain permits synthesis of cyclic, branched-chain, sulfonated, glycosylated, and phosphorylated peptides.

[0236] A large number of other FMOC-protected non-natural amino acid analogues capable of incorporation during chemical synthesis are available commercially, including, e.g., Fmoc-2-aminobicyclo[2.2.1]heptane-2-carboxylic acid, Fmoc-3-endo-aminobicyclo[2.2.1]heptane-2-endo-carboxylic acid, Fmoc-3-exo-aminobicyclo[2.2.1]heptane-2-exo-carboxylic acid, Fmoc-3-endo-amino-bicyclo[2.2.1]hept-5-ene-2-endo-carboxylic acid, Fmoc-3-exo-amino-bicyclo[2.2.1]hept-5-ene-2-exo-carboxylic acid, Fmoc-cis-2-amino-1-cyclohexanecarboxylic acid, Fmoc-trans-2-amino-1-cyclohexanecarboxylic acid, Fmoc-1-amino-1-cyclopentanecarboxylic acid, Fmoc-cis-2-amino-1-cyclopentanecarboxylic acid, Fmoc-1-amino-1-cyclopropanecarboxylic acid, Fmoc-D-2-amino-4-(ethylthio)butyric acid, Fmoc-L-2-amino-4-(ethylthio)butyric acid, Fmoc-L-buthionine, Fmoc-S-methyl-L-Cysteine, Fmoc-2-aminobenzoic acid (anthranilic acid), Fmoc-3-aminobenzoic acid, Fmoc-4-aminobenzoic acid, Fmoc-2-aminobenzophenone-2′-carboxylic acid, Fmoc-N-(4-aminobenzoyl)-β-alanine, Fmoc-2-amino-4,5-dimethoxybenzoic acid, Fmoc-4-aminohippuric acid, Fmoc-2-amino-3-hydroxybenzoic acid, Fmoc-2-amino-5-hydroxybenzoic acid, Fmoc-3-amino-4-hydroxybenzoic acid, Fmoc-4-amino-3-hydroxybenzoic acid, Fmoc-4-amino-2-hydroxybenzoic acid, Fmoc-5-amino-2-hydroxybenzoic acid, Fmoc-2-amino-3-methoxybenzoic acid, Fmoc-4-amino-3-methoxybenzoic acid, Fmoc-2-amino-3-methylbenzoic acid, Fmoc-2-amino-5-methylbenzoic acid, Fmoc-2-amino-6-methylbenzoic acid, Fmoc-3-amino-2-methylbenzoic acid, Fmoc-3-amino-4-methylbenzoic acid, Fmoc-4-amino-3-methylbenzoic acid, Fmoc-3-amino-2-naphthoic acid, Fmoc-D,L-3-amino-3-phenylpropionic acid, Fmoc-L-Methyldopa, Fmoc-2-amino-4,6-dimethyl-3-pyridinecarboxylic acid, Fmoc-D,L-amino-2-thiophenacetic acid, Fmoc-4-(carboxymethyl)piperazine, Fmoc-4-carboxypiperazine, Fmoc-4-(carboxymethyl)homopiperazine, Fmoc-4-phenyl-4-piperidinecarboxylic acid, Fmoc-L-1,2,3,4-tetrahydronorharman-3-carboxylic acid, Fmoc-L-thiazolidine-4-carboxylic acid, all available from The Peptide Laboratory (Richmond, Calif., USA).

[0237] Non-natural residues can also be added biosynthetically by engineering a suppressor tRNA, typically one that recognizes the UAG stop codon, by chemical aminoacylation with the desired unnatural amino acid. Conventional site-directed mutagenesis is used to introduce the chosen stop codon UAG at the site of interest in the protein gene. When the acylated suppressor tRNA and the mutant gene are combined in an in vitro transcription/translation system, the unnatural amino acid is incorporated in response to the UAG codon to give a protein containing that amino acid at the specified position. Liu et al., Proc. Natl. Acad. Sci. USA 96(9): 4780-5 (1999); Wang et al., Science 292(5516): 498-500 (2001).
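
The sequence-design step of the suppression approach described above, placing a UAG codon at the site of interest, can be sketched as follows. This is a minimal illustration only, assuming an in-frame coding sequence written as DNA (so that the UAG codon appears as TAG) and 1-based residue numbering; the function name and the example sequence are hypothetical and do not correspond to any sequence of the invention.

```python
# Illustrative sketch: replace one codon of an in-frame coding sequence with
# the amber stop codon (UAG in mRNA, written TAG in the DNA coding strand).
# Assumptions: the input starts at the first codon and residue numbering is 1-based.

AMBER_DNA = "TAG"


def introduce_amber_codon(coding_dna: str, residue_number: int) -> str:
    """Return the coding sequence with the codon at residue_number set to TAG."""
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    n_codons = len(seq) // 3
    if not 1 <= residue_number <= n_codons:
        raise ValueError("residue_number is outside the coding sequence")
    start = 3 * (residue_number - 1)  # 0-based index of the codon's first base
    return seq[:start] + AMBER_DNA + seq[start + 3:]


if __name__ == "__main__":
    # Hypothetical four-codon sequence: ATG GCT AAA GGT
    print(introduce_amber_codon("ATGGCTAAAGGT", 3))
```

The sketch covers only the choice of the mutant sequence; the mutagenesis itself and the aminoacylation of the suppressor tRNA are laboratory steps outside its scope.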

[0238] Fusion Proteins

[0239] The present invention further provides fusions of each of the polypeptides and fragments of the present invention to heterologous polypeptides. In a preferred embodiment, the polypeptide is a BSP. In a more preferred embodiment, the polypeptide that is fused to the heterologous polypeptide comprises part or all of the amino acid sequence of SEQ ID NO: 116 through 218, or is a mutein, homologous polypeptide, analog or derivative thereof. In an even more preferred embodiment, the nucleic acid molecule encoding the fusion protein comprises all or part of the nucleic acid sequence of SEQ ID NO: 1 through 115, or comprises all or part of a nucleic acid sequence that selectively hybridizes or is homologous to a nucleic acid molecule comprising a nucleic acid sequence of SEQ ID NO: 1 through 115.

[0240] The fusion proteins of the present invention will include at least one fragment of the protein of the present invention, which fragment is at least 6, typically at least 8, often at least 15, and usefully at least 16, 17, 18, 19, or 20 amino acids long. The fragment of the protein of the present invention to be included in the fusion can usefully be at least 25 amino acids long, at least 50 amino acids long, and can be at least 75, 100, or even 150 amino acids long. Fusions that include the entirety of the proteins of the present invention have particular utility.

[0241] The heterologous polypeptide included within the fusion protein of the present invention is at least 6 amino acids in length, often at least 8 amino acids in length, and usefully at least 15, 20, and 25 amino acids in length. Fusions that include larger polypeptides, such as the IgG Fc region, and even entire proteins (such as GFP chromophore-containing proteins) are particularly useful.

[0242] As described above in the description of vectors and expression vectors of the present invention, which discussion is incorporated here by reference in its entirety, heterologous polypeptides to be included in the fusion proteins of the present invention can usefully include those designed to facilitate purification and/or visualization of recombinantly-expressed proteins. See, e.g., Ausubel, Chapter 16, (1992), supra. Although purification tags can also be incorporated into fusions that are chemically synthesized, chemical synthesis typically provides sufficient purity that further purification by HPLC suffices; however, visualization tags as above described retain their utility even when the protein is produced by chemical synthesis, and when so included render the fusion proteins of the present invention useful as directly detectable markers of the presence of a polypeptide of the invention.

[0243] As also discussed above, heterologous polypeptides to be included in the fusion proteins of the present invention can usefully include those that facilitate secretion of recombinantly expressed proteins (into the periplasmic space or extracellular milieu for prokaryotic hosts, into the culture medium for eukaryotic cells) through incorporation of secretion signals and/or leader sequences. Heterologous polypeptides can likewise facilitate purification. For example, a His₆-tagged protein can be purified on a Ni affinity column and a GST fusion protein can be purified on a glutathione affinity column. Similarly, a fusion protein comprising the Fc domain of IgG can be purified on a Protein A or Protein G column and a fusion protein comprising an epitope tag such as myc can be purified using an immunoaffinity column containing an anti-c-myc antibody. It is preferable that the epitope tag be separated from the protein encoded by the essential gene by an enzymatic cleavage site that can be cleaved after purification. See also the discussion of nucleic acid molecules encoding fusion proteins that may be expressed on the surface of a cell.

[0244] Other useful protein fusions of the present invention include those that permit use of the protein of the present invention as bait in a yeast two-hybrid system. See Bartel et al. (eds.), The Yeast Two-Hybrid System, Oxford University Press (1997); Zhu et al., Yeast Hybrid Technologies, Eaton Publishing (2000); Fields et al., Trends Genet. 10(8): 286-92 (1994); Mendelsohn et al., Curr. Opin. Biotechnol. 5(5): 482-6 (1994); Luban et al., Curr. Opin. Biotechnol. 6(1): 59-64 (1995); Allen et al., Trends Biochem. Sci. 20(12): 511-6 (1995); Drees, Curr. Opin. Chem. Biol. 3(1): 64-70 (1999); Topcu et al., Pharm. Res. 17(9): 1049-55 (2000); Fashena et al., Gene 250(1-2): 1-14 (2000); Colas et al., (1996) Genetic selection of peptide aptamers that recognize and inhibit cyclin-dependent kinase 2. Nature 380, 548-550; Norman, T. et al., (1999) Genetic selection of peptide inhibitors of biological pathways. Science 285, 591-595; Fabbrizio et al., (1999) Inhibition of mammalian cell proliferation by genetically selected peptide aptamers that functionally antagonize E2F activity. Oncogene 18, 4357-4363; Xu et al., (1997) Cells that register logical relationships among proteins. Proc Natl Acad Sci USA 94, 12473-12478; Yang et al., (1995) Protein-peptide interactions analyzed with the yeast two-hybrid system. Nuc. Acids Res. 23, 1152-1156; Kolonin et al., (1998) Targeting cyclin-dependent kinases in Drosophila with peptide aptamers. Proc Natl Acad Sci USA 95, 14266-14271; Cohen et al., (1998) An artificial cell-cycle inhibitor isolated from a combinatorial library. Proc Natl Acad Sci USA 95, 14272-14277; Uetz et al., (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623-627; Ito et al., (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA 98, 4569-4574, the disclosures of which are incorporated herein by reference in their entireties. Typically, such fusion is to either E. coli LexA or yeast GAL4 DNA binding domains. Related bait plasmids are available that express the bait fused to a nuclear localization signal.

[0245] Other useful fusion proteins include those that permit display of the encoded protein on the surface of a phage or cell, fusions to intrinsically fluorescent proteins, such as green fluorescent protein (GFP), and fusions to the IgG Fc region, as described above, which discussion is incorporated here by reference in its entirety.

[0246] The polypeptides and fragments of the present invention can also usefully be fused to protein toxins, such as Pseudomonas exotoxin A, diphtheria toxin, shiga toxin A, anthrax toxin lethal factor, or ricin, in order to effect ablation of cells that bind or take up the proteins of the present invention.

[0247] Fusion partners include, inter alia, myc, hemagglutinin (HA), GST, immunoglobulins, β-galactosidase, biotin, trpE, protein A, β-lactamase, α-amylase, maltose binding protein, alcohol dehydrogenase, polyhistidine (for example, six histidines at the amino and/or carboxyl terminus of the polypeptide), lacZ, green fluorescent protein (GFP), yeast α mating factor, GAL4 transcription activation or DNA binding domain, luciferase, and serum proteins such as ovalbumin, albumin and the constant domain of IgG. See, e.g., Ausubel (1992), supra and Ausubel (1999), supra. Fusion proteins may also contain sites for specific enzymatic cleavage, such as a site that is recognized by enzymes such as Factor XIII, trypsin, pepsin, or any other enzyme known in the art. Fusion proteins will typically be made by recombinant nucleic acid methods, as described above, chemically synthesized using techniques well-known in the art (e.g., a Merrifield synthesis), or produced by chemical cross-linking.

[0248] Another advantage of fusion proteins is that the epitope tag can be used to bind the fusion protein to a plate or column through an affinity linkage for screening binding proteins or other molecules that bind to the BSP.

[0249] As further described below, the isolated polypeptides, muteins, fusion proteins, homologous proteins or allelic variants of the present invention can readily be used as specific immunogens to raise antibodies that specifically recognize BSPs, their allelic variants and homologues. The antibodies, in turn, can be used, inter alia, specifically to assay for the polypeptides of the present invention, particularly BSPs, e.g. by ELISA for detection of protein in fluid samples, such as serum, by immunohistochemistry or laser scanning cytometry, for detection of protein in tissue samples, or by flow cytometry, for detection of intracellular protein in cell suspensions, for specific antibody-mediated isolation and/or purification of BSPs, as for example by immunoprecipitation, and for use as specific agonists or antagonists of BSPs.

[0250] One may determine whether polypeptides including muteins, fusion proteins, homologous proteins or allelic variants are functional by methods known in the art. For instance, residues that are tolerant of change while retaining function can be identified by altering the protein at known residues using methods known in the art, such as alanine scanning mutagenesis, Cunningham et al., Science 244(4908): 1081-5 (1989); transposon linker scanning mutagenesis, Chen et al., Gene 263(1-2): 39-48 (2001); combinations of homolog- and alanine-scanning mutagenesis, Jin et al., J. Mol. Biol. 226(3): 851-65 (1992); combinatorial alanine scanning, Weiss et al., Proc. Natl. Acad. Sci. USA 97(16): 8950-4 (2000), followed by functional assay. Transposon linker scanning kits are available commercially (New England Biolabs, Beverly, Mass., USA, catalog. no. E7-102S; EZ::TN™ In-Frame Linker Insertion Kit, catalogue no. EZI04KN, Epicentre Technologies Corporation, Madison, Wis., USA).
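
As an illustration of the design stage of alanine scanning mutagenesis mentioned above, the set of single-residue alanine variants of a polypeptide can be enumerated as sketched below. This is a minimal sketch under stated assumptions: sequences use the one-letter amino acid code, positions are 1-based, existing alanine residues are skipped, and variants are named in the conventional form of original residue, position and replacement (e.g., K2A). The function name and the example sequence are hypothetical; the functional assay that follows is an experimental step and is not modeled.

```python
# Illustrative sketch: enumerate single alanine substitutions of a protein
# sequence given in one-letter code, as a design aid for alanine scanning.
# Assumptions: positions are 1-based; residues that are already alanine are skipped.

from typing import Dict, Iterable, Optional


def alanine_scan(protein: str, positions: Optional[Iterable[int]] = None) -> Dict[str, str]:
    """Map variant names such as 'K2A' to the corresponding variant sequences."""
    wanted = set(positions) if positions is not None else None
    variants: Dict[str, str] = {}
    for index, residue in enumerate(protein, start=1):
        if wanted is not None and index not in wanted:
            continue  # position not requested for scanning
        if residue == "A":
            continue  # already alanine; no substitution to make
        variants[f"{residue}{index}A"] = protein[:index - 1] + "A" + protein[index:]
    return variants


if __name__ == "__main__":
    for name, sequence in alanine_scan("MKAT").items():
        print(name, sequence)
```

Each enumerated variant would then be expressed and tested in the functional assay to identify residues that tolerate change.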

[0251] Purification of the polypeptides including fragments, homologous polypeptides, muteins, analogs, derivatives and fusion proteins is well-known and within the skill of one having ordinary skill in the art. See, e.g., Scopes, Protein Purification, 2d ed. (1987). Purification of recombinantly expressed polypeptides is described above. Purification of chemically-synthesized peptides can readily be effected, e.g., by HPLC.

[0252] Accordingly, it is an aspect of the present invention to provide the isolated proteins of the present invention in pure or substantially pure form in the presence or absence of a stabilizing agent. Stabilizing agents include both proteinaceous and non-proteinaceous material and are well-known in the art. Stabilizing agents, such as albumin and polyethylene glycol (PEG), are known and are commercially available.

[0253] Although high levels of purity are preferred when the isolated proteins of the present invention are used as therapeutic agents, such as in vaccines and as replacement therapy, the isolated proteins of the present invention are also useful at lower purity. For example, partially purified proteins of the present invention can be used as immunogens to raise antibodies in laboratory animals.

[0254] In preferred embodiments, the purified and substantially purified proteins of the present invention are in compositions that lack detectable ampholytes, acrylamide monomers, bis-acrylamide monomers, and polyacrylamide.

[0255] The polypeptides, fragments, analogs, derivatives and fusions of the present invention can usefully be attached to a substrate. The substrate can be porous or solid, planar or non-planar; the bond can be covalent or noncovalent.

[0256] For example, the polypeptides, fragments, analogs, derivatives and fusions of the present invention can usefully be bound to a porous substrate, commonly a membrane, typically comprising nitrocellulose, polyvinylidene fluoride (PVDF), or cationically derivatized, hydrophilic PVDF; so bound, the proteins, fragments, and fusions of the present invention can be used to detect and quantify antibodies, e.g. in serum, that bind specifically to the immobilized protein of the present invention.

[0257] As another example, the polypeptides, fragments, analogs, derivatives and fusions of the present invention can usefully be bound to a substantially nonporous substrate, such as plastic, to detect and quantify antibodies, e.g. in serum, that bind specifically to the immobilized protein of the present invention. Such plastics include polymethylacrylic, polyethylene, polypropylene, polyacrylate, polymethylmethacrylate, polyvinylchloride, polytetrafluoroethylene, polystyrene, polycarbonate, polyacetal, polysulfone, cellulose acetate, cellulose nitrate, nitrocellulose, or mixtures thereof; when the assay is performed in a standard microtiter dish, the plastic is typically polystyrene.

[0258] The polypeptides, fragments, analogs, derivatives and fusions of the present invention can also be attached to a substrate suitable for use as a surface enhanced laser desorption ionization source; so attached, the protein, fragment, or fusion of the present invention is useful for binding and then detecting secondary proteins that bind with sufficient affinity or avidity to the surface-bound protein to indicate biological interaction therebetween. The proteins, fragments, and fusions of the present invention can also be attached to a substrate suitable for use in surface plasmon resonance detection; so attached, the protein, fragment, or fusion of the present invention is useful for binding and then detecting secondary proteins that bind with sufficient affinity or avidity to the surface-bound protein to indicate biological interaction therebetween.

[0259] Antibodies

[0260] In another aspect, the invention provides antibodies, including fragments and derivatives thereof, that bind specifically to polypeptides encoded by the nucleic acid molecules of the invention, as well as antibodies that bind to fragments, muteins, derivatives and analogs of the polypeptides. In a preferred embodiment, the antibodies are specific for a polypeptide that is a BSP, or a fragment, mutein, derivative, analog or fusion protein thereof. In a more preferred embodiment, the antibodies are specific for a polypeptide that comprises SEQ ID NO: 116 through 218, or a fragment, mutein, derivative, analog or fusion protein thereof.

[0261] The antibodies of the present invention can be specific for linear epitopes, discontinuous epitopes, or conformational epitopes of such proteins or protein fragments, either as present on the protein in its native conformation or, in some cases, as present on the proteins as denatured, as, e.g., by solubilization in SDS. New epitopes may also be due to a difference in post-translational modifications (PTMs) in disease versus normal tissue. For example, a particular site on a BSP may be glycosylated in cancerous cells, but not glycosylated in normal cells, or vice versa. In addition, alternative splice forms of a BSP may be indicative of cancer. Differential degradation of the C- or N-terminus of a BSP may also be a marker or target for anticancer therapy. For example, a BSP may be N-terminally degraded in cancer cells, exposing new epitopes to which antibodies may selectively bind for diagnostic or therapeutic uses.

[0262] As is well-known in the art, the degree to which an antibody can discriminate as among molecular species in a mixture will depend, in part, upon the conformational relatedness of the species in the mixture; typically, the antibodies of the present invention will discriminate over adventitious binding to non-BSP polypeptides by at least 2-fold, more typically by at least 5-fold, typically by more than 10-fold, 25-fold, 50-fold, 75-fold, and often by more than 100-fold, and on occasion by more than 500-fold or 1000-fold. When used to detect the proteins or protein fragments of the present invention, the antibody of the present invention is sufficiently specific when it can be used to determine the presence of the protein of the present invention in samples derived from human breast.
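
The fold-discrimination figures above can be illustrated with a minimal sketch. It assumes, purely for illustration, that discrimination is estimated as the ratio of a binding signal measured against a BSP to the signal measured against a non-BSP polypeptide under the same conditions; the function names and the treatment of every listed figure as an inclusive threshold are hypothetical simplifications, not part of the specification.

```python
# Illustrative sketch only: names and the ratio-based definition are assumptions.

# Fold thresholds listed in the paragraph above, highest first.
FOLD_TIERS = (1000, 500, 100, 75, 50, 25, 10, 5, 2)


def fold_discrimination(specific_signal: float, nonspecific_signal: float) -> float:
    """Ratio of signal on the target (BSP) to signal on a non-BSP polypeptide."""
    if nonspecific_signal <= 0:
        raise ValueError("nonspecific_signal must be positive")
    return specific_signal / nonspecific_signal


def highest_tier_met(fold: float):
    """Return the largest listed threshold that the ratio reaches, or None.

    Simplification: every threshold is treated as inclusive (>=).
    """
    for tier in FOLD_TIERS:
        if fold >= tier:
            return tier
    return None


if __name__ == "__main__":
    ratio = fold_discrimination(1200, 100)
    print(ratio, highest_tier_met(ratio))
```

A ratio below 2 returns None, which corresponds to an antibody that does not meet even the lowest discrimination level recited above.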

[0263] Typically, the affinity or avidity of an antibody (or antibody multimer, as in the case of an IgM pentamer) of the present invention for a protein or protein fragment of the present invention will be at least about 1×10⁻⁶ molar (M), typically at least about 5×10⁻⁷ M, 1×10⁻⁷ M, with affinities and avidities of at least 1×10⁻⁸ M, 5×10⁻⁹ M, 1×10⁻¹⁰ M and up to 1×10⁻¹³ M proving especially useful.
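
As a rough illustration of what the molar figures above imply, the following sketch assumes that they are read as equilibrium dissociation constants and that binding follows a simple 1:1 model at equilibrium, in which the fraction of antibody sites occupied is [antigen] / ([antigen] + K_d). Both the model and the function name are assumptions made for illustration; avidity effects of multimers such as IgM are not captured.

```python
# Illustrative sketch only: assumes simple 1:1 equilibrium binding.

def fraction_bound(free_antigen_molar: float, kd_molar: float) -> float:
    """Fraction of antibody binding sites occupied at a given free antigen concentration."""
    if free_antigen_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_antigen_molar / (free_antigen_molar + kd_molar)


if __name__ == "__main__":
    antigen = 1e-9  # hypothetical 1 nM free antigen
    for kd in (1e-6, 1e-7, 1e-8, 1e-9, 1e-10):
        print(f"Kd = {kd:.0e} M -> fraction bound = {fraction_bound(antigen, kd):.4f}")
```

Under these assumptions, occupancy at a fixed antigen concentration rises as the dissociation constant falls, which is consistent with the lower molar values being described as especially useful.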

[0264] The antibodies of the present invention can be naturally-occurring forms, such as IgG, IgM, IgD, IgE, IgY, and IgA, from any avian, reptilian, or mammalian species.

[0265] Human antibodies can, but will infrequently, be drawn directly from human donors or human cells. In this case, antibodies to the proteins of the present invention will typically have resulted from fortuitous immunization, such as autoimmune immunization, with the protein or protein fragments of the present invention. Such antibodies will typically, but will not invariably, be polyclonal. In addition, individual polyclonal antibodies may be isolated and cloned to generate monoclonals.

[0266] Human antibodies are more frequently obtained using transgenic animals that express human immunoglobulin genes, which transgenic animals can be affirmatively immunized with the protein immunogen of the present invention. Human Ig-transgenic mice capable of producing human antibodies and methods of producing human antibodies therefrom upon specific immunization are described, inter alia, in U.S. Pat. Nos. 6,162,963; 6,150,584; 6,114,598; 6,075,181; 5,939,598; 5,877,397; 5,874,299; 5,814,318; 5,789,650; 5,770,429; 5,661,016; 5,633,425; 5,625,126; 5,569,825; 5,545,807; 5,545,806, and 5,591,669, the disclosures of which are incorporated herein by reference in their entireties. Such antibodies are typically monoclonal, and are typically produced using techniques developed for production of murine antibodies.

[0267] Human antibodies are particularly useful, and often preferred, when the antibodies of the present invention are to be administered to human beings as in vivo diagnostic or therapeutic agents, since recipient immune response to the administered antibody will often be substantially less than that occasioned by administration of an antibody derived from another species, such as mouse.

[0268] IgG, IgM, IgD, IgE, IgY, and IgA antibodies of the present invention can also be obtained from other species, including mammals such as rodents (typically mouse, but also rat, guinea pig, and hamster), lagomorphs (typically rabbits), and also larger mammals, such as sheep, goats, cows, and horses, and egg-laying birds or reptiles such as chickens or alligators. For example, avian antibodies may be generated using techniques described in WO 00/29444, published May 25, 2000, the contents of which are hereby incorporated in their entirety. In such cases, as with the transgenic human-antibody-producing non-human mammals, fortuitous immunization is not required, and the non-human animal is typically affirmatively immunized, according to standard immunization protocols, with the protein or protein fragment of the present invention.

[0269] As discussed above, virtually all fragments of 8 or more contiguous amino acids of the proteins of the present invention can be used effectively as immunogens when conjugated to a carrier, typically a protein such as bovine thyroglobulin, keyhole limpet hemocyanin, or bovine serum albumin, conveniently using a bifunctional linker such as those described elsewhere above, which discussion is incorporated by reference here.

[0270] Immunogenicity can also be conferred by fusion of the polypeptide and fragments of the present invention to other moieties. For example, peptides of the present invention can be produced by solid phase synthesis on a branched polylysine core matrix; these multiple antigenic peptides (MAPs) provide high purity, increased avidity, accurate chemical definition and improved safety in vaccine development. Tam et al., Proc. Natl. Acad. Sci. USA 85: 5409-5413 (1988); Posnett et al., J. Biol. Chem. 263: 1719-1725 (1988).

[0271] Protocols for immunizing non-human mammals or avian species are well-established in the art. See Harlow et al. (eds.), Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory (1998); Coligan et al. (eds.), Current Protocols in Immunology, John Wiley & Sons, Inc. (2001); Zola, Monoclonal Antibodies: Preparation and Use of Monoclonal Antibodies and Engineered Antibody Derivatives (Basics: From Background to Bench), Springer Verlag (2000); Gross M, Speck J. Dtsch. Tierarztl. Wochenschr. 103: 417-422 (1996), the disclosures of which are incorporated herein by reference. Immunization protocols often include multiple immunizations, either with or without adjuvants such as Freund's complete adjuvant and Freund's incomplete adjuvant, and may include naked DNA immunization (Moss, Semin. Immunol. 2: 317-327 (1990)).

[0272] Antibodies from non-human mammals and avian species can be polyclonal or monoclonal, with polyclonal antibodies having certain advantages in immunohistochemical detection of the proteins of the present invention and monoclonal antibodies having advantages in identifying and distinguishing particular epitopes of the proteins of the present invention. Antibodies from avian species may have particular advantage in detection of the proteins of the present invention in human serum or tissues (Vikinge et al., Biosens. Bioelectron. 13: 1257-1262 (1998)).

[0273] Following immunization, the antibodies of the present invention can be produced using any art-accepted technique. Such techniques are well-known in the art (Coligan, supra; Zola, supra; Howard et al. (eds.), Basic Methods in Antibody Production and Characterization, CRC Press (2000); Harlow, supra; Davis (ed.), Monoclonal Antibody Protocols, Vol. 45, Humana Press (1995); Delves (ed.), Antibody Production: Essential Techniques, John Wiley & Son Ltd (1997); Kenney, Antibody Solution: An Antibody Methods Manual, Chapman & Hall (1997), incorporated herein by reference in their entireties) and thus need not be detailed here.

[0274] Briefly, however, such techniques include, inter alia, production of monoclonal antibodies by hybridomas and expression of antibodies or fragments or derivatives thereof from host cells engineered to express immunoglobulin genes or fragments thereof. These two methods of production are not mutually exclusive: genes encoding antibodies specific for the proteins or protein fragments of the present invention can be cloned from hybridomas and thereafter expressed in other host cells. Nor need the two necessarily be performed together: e.g., genes encoding antibodies specific for the proteins and protein fragments of the present invention can be cloned directly from B cells known to be specific for the desired protein, as further described in U.S. Pat. No. 5,627,052, the disclosure of which is incorporated herein by reference in its entirety, or from antibody-displaying phage.

[0275] Recombinant expression in host cells is particularly useful when fragments or derivatives of the antibodies of the present invention are desired.

[0276] Host cells for recombinant production of either whole antibodies, antibody fragments, or antibody derivatives can be prokaryotic or eukaryotic.

[0277] Prokaryotic hosts are particularly useful for producing phage-displayed antibodies of the present invention.

[0278] The technology of phage-displayed antibodies, in which antibody variable region fragments are fused, for example, to the gene III protein (pIII) or gene VIII protein (pVIII) for display on the surface of filamentous phage, such as M13, is by now well-established. See, e.g., Sidhu, Curr. Opin. Biotechnol. 11(6): 610-6 (2000); Griffiths et al., Curr. Opin. Biotechnol. 9(1): 102-8 (1998); Hoogenboom et al., Immunotechnology 4(1): 1-20 (1998); Rader et al., Current Opinion in Biotechnology 8: 503-508 (1997); Aujame et al., Human Antibodies 8: 155-168 (1997); Hoogenboom, Trends in Biotechnol. 15: 62-70 (1997); de Kruif et al., 17: 453-455 (1996); Barbas et al., Trends in Biotechnol. 14: 230-234 (1996); Winter et al., Ann. Rev. Immunol. 433-455 (1994). Techniques and protocols required to generate, propagate, screen (pan), and use the antibody fragments from such libraries have recently been compiled. See, e.g., Barbas (2001), supra; Kay, supra; Abelson, supra, the disclosures of which are incorporated herein by reference in their entireties.

[0279] Typically, phage-displayed antibody fragments are scFv fragments or Fab fragments; when desired, full length antibodies can be produced by cloning the variable regions from the displaying phage into a complete antibody and expressing the full length antibody in a further prokaryotic or a eukaryotic host cell.

[0280] Eukaryotic cells are also useful for expression of the antibodies, antibody fragments, and antibody derivatives of the present invention.

[0281] For example, antibody fragments of the present invention can be produced in Pichia pastoris and in Saccharomyces cerevisiae. See, e.g., Takahashi et al., Biosci. Biotechnol. Biochem. 64(10): 2138-44 (2000); Freyre et al., J. Biotechnol. 76(2-3): 157-63 (2000); Fischer et al., Biotechnol. Appl. Biochem. 30 (Pt 2): 117-20 (1999); Pennell et al., Res. Immunol. 149(6): 599-603 (1998); Eldin et al., J. Immunol. Methods 201(1): 67-75 (1997); Frenken et al., Res. Immunol. 149(6): 589-99 (1998); Shusta et al., Nature Biotechnol. 16(8): 773-7 (1998), the disclosures of which are incorporated herein by reference in their entireties.

[0282] Antibodies, including antibody fragments and derivatives, of the present invention can also be produced in insect cells. See, e.g., Li et al., Protein Expr. Purif. 21(1): 121-8 (2001); Ailor et al., Biotechnol. Bioeng. 58(2-3): 196-203 (1998); Hsu et al., Biotechnol. Prog. 13(1): 96-104 (1997); Edelman et al., Immunology 91(1): 13-9 (1997); and Nesbit et al., J. Immunol. Methods 151(1-2): 201-8 (1992), the disclosures of which are incorporated herein by reference in their entireties.

[0283] Antibodies and fragments and derivatives thereof of the present invention can also be produced in plant cells, particularly maize or tobacco. See, e.g., Giddings et al., Nature Biotechnol. 18(11): 1151-5 (2000); Gavilondo et al., Biotechniques 29(1): 128-38 (2000); Fischer et al., J. Biol. Regul. Homeost. Agents 14(2): 83-92 (2000); Fischer et al., Biotechnol. Appl. Biochem. 30 (Pt 2): 113-6 (1999); Fischer et al., Biol. Chem. 380(7-8): 825-39 (1999); Russell, Curr. Top. Microbiol. Immunol. 240: 119-38 (1999); and Ma et al., Plant Physiol. 109(2): 341-6 (1995), the disclosures of which are incorporated herein by reference in their entireties.

[0284] Antibodies, including antibody fragments and derivatives, of the present invention can also be produced in transgenic, non-human, mammalian milk. See, e.g., Pollock et al., J. Immunol. Methods 231: 147-57 (1999); Young et al., Res. Immunol. 149: 609-10 (1998); Limonta et al., Immunotechnology 1: 107-13 (1995), the disclosures of which are incorporated herein by reference in their entireties.

[0285] Mammalian cells useful for recombinant expression of antibodies, antibody fragments, and antibody derivatives of the present invention include CHO cells, COS cells, 293 cells, and myeloma cells.

[0286] Verma et al., J. Immunol. Methods 216(1-2): 165-81 (1998), herein incorporated by reference, review and compare bacterial, yeast, insect and mammalian expression systems for expression of antibodies.

[0287] Antibodies of the present invention can also be prepared by cell-free translation, as further described in Merk et al., J. Biochem. (Tokyo) 125(2): 328-33 (1999) and Ryabova et al., Nature Biotechnol. 15(1): 79-84 (1997), and in the milk of transgenic animals, as further described in Pollock et al., J. Immunol. Methods 231(1-2): 147-57 (1999), the disclosures of which are incorporated herein by reference in their entireties.

[0288] The invention further provides antibody fragments that bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention.

[0289] Among such useful fragments are Fab, Fab′, Fv, F(ab′)₂, and single chain Fv (scFv) fragments. Other useful fragments are described in Hudson, Curr. Opin. Biotechnol. 9(4): 395-402 (1998).

[0290] It is also an aspect of the present invention to provide antibody derivatives that bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention.

[0291] Among such useful derivatives are chimeric, primatized, and humanized antibodies; such derivatives are less immunogenic in human beings, and thus more suitable for in vivo administration, than are unmodified antibodies from non-human mammalian species. Another useful derivative is a PEGylated antibody, in which PEGylation increases the serum half-life of the antibody.

[0292] Chimeric antibodies typically include heavy and/or light chain variable regions (including both CDR and framework residues) of immunoglobulins of one species, typically mouse, fused to constant regions of another species, typically human. See, e.g., U.S. Pat. No. 5,807,715; Morrison et al., Proc. Natl. Acad. Sci. USA 81(21): 6851-5 (1984); Sharon et al., Nature 309(5966): 364-7 (1984); Takeda et al., Nature 314(6010): 452-4 (1985), the disclosures of which are incorporated herein by reference in their entireties. Primatized and humanized antibodies typically include heavy and/or light chain CDRs from a murine antibody grafted into a non-human primate or human antibody V region framework, usually further comprising a human constant region. See Riechmann et al., Nature 332(6162): 323-7 (1988); Co et al., Nature 351(6326): 501-2 (1991); U.S. Pat. Nos. 6,054,297; 5,821,337; 5,770,196; 5,766,886; 5,821,123; 5,869,619; 6,180,377; 6,013,256; 5,693,761; and 6,180,370, the disclosures of which are incorporated herein by reference in their entireties.

[0293] Other useful antibody derivatives of the invention include heteromeric antibody complexes and antibody fusions, such as diabodies (bispecific antibodies), single-chain diabodies, and intrabodies.

[0294] It is contemplated that the nucleic acids encoding the antibodies of the present invention can be operably joined to other nucleic acids forming a recombinant vector for cloning or for expression of the antibodies of the invention. The present invention includes any recombinant vector containing the coding sequences, or part thereof, whether for eukaryotic transduction, transfection or gene therapy. Such vectors may be prepared using conventional molecular biology techniques, known to those with skill in the art, and would comprise DNA encoding sequences for the immunoglobulin V-regions including framework and CDRs or parts thereof, and a suitable promoter either with or without a signal sequence for intracellular transport. Such vectors may be transduced or transfected into eukaryotic cells or used for gene therapy (Marasco et al., Proc. Natl. Acad. Sci. (USA) 90: 7889-7893 (1993); Duan et al., Proc. Natl. Acad. Sci. (USA) 91: 5075-5079 (1994)), by conventional techniques, known to those with skill in the art.

[0295] The antibodies of the present invention, including fragments and derivatives thereof, can usefully be labeled. It is, therefore, another aspect of the present invention to provide labeled antibodies that bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention.

[0296] The choice of label depends, in part, upon the desired use.

[0297] For example, when the antibodies of the present invention are used for immunohistochemical staining of tissue samples, the label is preferably an enzyme that catalyzes production and local deposition of a detectable product.

[0298] Enzymes typically conjugated to antibodies to permit their immunohistochemical visualization are well-known, and include alkaline phosphatase, β-galactosidase, glucose oxidase, horseradish peroxidase (HRP), and urease. Typical substrates for production and deposition of visually detectable products include o-nitrophenyl-beta-D-galactopyranoside (ONPG); o-phenylenediamine dihydrochloride (OPD); p-nitrophenyl phosphate (PNPP); p-nitrophenyl-beta-D-galactopyranoside (PNPG); 3,3′-diaminobenzidine (DAB); 3-amino-9-ethylcarbazole (AEC); 4-chloro-1-naphthol (CN); 5-bromo-4-chloro-3-indolyl-phosphate (BCIP); ABTS®; BluoGal; iodonitrotetrazolium (INT); nitroblue tetrazolium chloride (NBT); phenazine methosulfate (PMS); phenolphthalein monophosphate (PMP); tetramethyl benzidine (TMB); tetranitroblue tetrazolium (TNBT); X-Gal; X-Gluc; and X-Glucoside.

[0299] Other substrates can be used to produce products for local deposition that are luminescent. For example, in the presence of hydrogen peroxide (H₂O₂), horseradish peroxidase (HRP) can catalyze the oxidation of cyclic diacylhydrazides, such as luminol. Immediately following the oxidation, the luminol is in an excited state (intermediate reaction product), which decays to the ground state by emitting light. Strong enhancement of the light emission is produced by enhancers, such as phenolic compounds. Advantages include high sensitivity, high resolution, and rapid detection without radioactivity and requiring only small amounts of antibody. See, e.g., Thorpe et al., Methods Enzymol. 133: 331-53 (1986); Kricka et al., J. Immunoassay 17(1): 67-83 (1996); and Lundqvist et al., J. Biolumin. Chemilumin. 10(6): 353-9 (1995), the disclosures of which are incorporated herein by reference in their entireties. Kits for such enhanced chemiluminescent detection (ECL) are available commercially.

[0300] The antibodies can also be labeled using colloidal gold.

[0301] As another example, when the antibodies of the present invention are used, e.g., for flow cytometric detection, for scanning laser cytometric detection, or for fluorescent immunoassay, they can usefully be labeled with fluorophores.

[0302] There are a wide variety of fluorophore labels that can usefully be attached to the antibodies of the present invention.

[0303] For flow cytometric applications, both for extracellular detection and for intracellular detection, commonly useful fluorophores include fluorescein isothiocyanate (FITC), allophycocyanin (APC), R-phycoerythrin (PE), peridinin chlorophyll protein (PerCP), Texas Red, Cy3, Cy5, and fluorescence resonance energy tandem fluorophores such as PerCP-Cy5.5, PE-Cy5, PE-Cy5.5, PE-Cy7, PE-Texas Red, and APC-Cy7.

[0304] Other fluorophores include, inter alia, Alexa Fluor® 350, Alexa Fluor® 488, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647 (monoclonal antibody labeling kits available from Molecular Probes, Inc., Eugene, Oreg., USA), BODIPY dyes, such as BODIPY 493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY TR, BODIPY 630/650, BODIPY 650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine, Texas Red (available from Molecular Probes, Inc., Eugene, Oreg., USA), and Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, all of which are also useful for fluorescently labeling the antibodies of the present invention.

[0305] For secondary detection using labeled avidin, streptavidin, captavidin or neutravidin, the antibodies of the present invention can usefully be labeled with biotin.

[0306] When the antibodies of the present invention are used, e.g., for Western blotting applications, they can usefully be labeled with radioisotopes, such as ³³P, ³²P, ³⁵S, ³H, and ¹²⁵I.

[0307] As another example, when the antibodies of the present invention are used for radioimmunotherapy, the label can usefully be ²²⁸Th, ²²⁷Ac, ²²⁵Ac, ²²³Ra, ²¹³Bi, ²¹²Pb, ²¹²Bi, ²¹¹At, ²⁰³Pb, ¹⁹⁴Os, ¹⁸⁸Re, ¹⁸⁶Re, ¹⁵³Sm, ¹⁴⁹Tb, ¹³¹I, ¹²⁵I, ¹¹¹In, ¹⁰⁵Rh, ⁹⁹ᵐTc, ⁹⁷Ru, ⁹⁰Y, ⁹⁰Sr, ⁸⁸Y, ⁷²Se, ⁶⁷Cu, or ⁴⁷Sc.

[0308] As another example, when the antibodies of the present invention are to be used for in vivo diagnostic use, they can be rendered detectable by conjugation to MRI contrast agents, such as gadolinium diethylenetriaminepentaacetic acid (DTPA) (Lauffer et al., Radiology 207(2): 529-38 (1998)), or by radioisotopic labeling.

[0309] As would be understood, use of the labels described above is not restricted to the application for which they are mentioned.

[0310] The antibodies of the present invention, including fragments and derivatives thereof, can also be conjugated to toxins, in order to target the toxin's ablative action to cells that display and/or express the proteins of the present invention. Commonly, the antibody in such immunotoxins is conjugated to Pseudomonas exotoxin A, diphtheria toxin, shiga toxin A, anthrax toxin lethal factor, or ricin. See Hall (ed.), Immunotoxin Methods and Protocols (Methods in Molecular Biology, vol. 166), Humana Press (2000); and Frankel et al. (eds.), Clinical Applications of Immunotoxins, Springer-Verlag (1998), the disclosures of which are incorporated herein by reference in their entireties.

[0311] The antibodies of the present invention can usefully be attached to a substrate, and it is, therefore, another aspect of the invention to provide antibodies that bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, attached to a substrate.

[0312] Substrates can be porous or nonporous, planar or nonplanar.

[0313] For example, the antibodies of the present invention can usefully be conjugated to filtration media, such as NHS-activated Sepharose or CNBr-activated Sepharose for purposes of immunoaffinity chromatography.

[0314] For example, the antibodies of the present invention can usefully be attached to paramagnetic microspheres, typically by biotin-streptavidin interaction, which microspheres can then be used for isolation of cells that express or display the proteins of the present invention. As another example, the antibodies of the present invention can usefully be attached to the surface of a microtiter plate for ELISA.

[0315] As noted above, the antibodies of the present invention can be produced in prokaryotic and eukaryotic cells. It is, therefore, another aspect of the present invention to provide cells that express the antibodies of the present invention, including hybridoma cells, B cells, plasma cells, and host cells recombinantly modified to express the antibodies of the present invention.

[0316] In yet a further aspect, the present invention provides aptamers evolved to bind specifically to one or more of the proteins and protein fragments of the present invention, to one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention, or the binding of which can be competitively inhibited by one or more of the proteins and protein fragments of the present invention or one or more of the proteins and protein fragments encoded by the isolated nucleic acids of the present invention.

[0317] In sum, one of skill in the art, provided with the teachings of this invention, has available a variety of methods which may be used to alter the biological properties of the antibodies of this invention including methods which would increase or decrease the stability or half-life, immunogenicity, toxicity, affinity or yield of a given antibody molecule, or to alter it in any other way that may render it more suitable for a particular application.

[0318] Transgenic Animals and Cells

[0319] In another aspect, the invention provides transgenic cells and non-human organisms comprising nucleic acid molecules of the invention. In a preferred embodiment, the transgenic cells and non-human organisms comprise a nucleic acid molecule encoding a BSP. In a preferred embodiment, the BSP comprises an amino acid sequence selected from SEQ ID NO: 116 through 218, or a fragment, mutein, homologous protein or allelic variant thereof. In another preferred embodiment, the transgenic cells and non-human organism comprise a BSNA of the invention, preferably a BSNA comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 through 115, or a part, substantially similar nucleic acid molecule, allelic variant or hybridizing nucleic acid molecule thereof.

[0320] In another embodiment, the transgenic cells and non-human organisms have a targeted disruption or replacement of the endogenous orthologue of the human BSG. The transgenic cells can be embryonic stem cells or somatic cells. The transgenic non-human organisms can be chimeric, nonchimeric heterozygotes, and nonchimeric homozygotes. Methods of producing transgenic animals are well-known in the art. See, e.g., Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual, 2d ed., Cold Spring Harbor Press (1999); Jackson et al., Mouse Genetics and Transgenics: A Practical Approach, Oxford University Press (2000); and Pinkert, Transgenic Animal Technology: A Laboratory Handbook, Academic Press (1999).

[0321] Any technique known in the art may be used to introduce a nucleic acid molecule of the invention into an animal to produce the founder lines of transgenic animals. Such techniques include, but are not limited to, pronuclear microinjection (see, e.g., Paterson et al., Appl. Microbiol. Biotechnol. 40: 691-698 (1994); Carver et al., Biotechnology 11: 1263-1270 (1993); Wright et al., Biotechnology 9: 830-834 (1991); and U.S. Pat. No. 4,873,191 (1989)); retrovirus-mediated gene transfer into germ lines, blastocysts or embryos (see, e.g., Van der Putten et al., Proc. Natl. Acad. Sci. USA 82: 6148-6152 (1985)); gene targeting in embryonic stem cells (see, e.g., Thompson et al., Cell 56: 313-321 (1989)); electroporation of cells or embryos (see, e.g., Lo, Mol. Cell. Biol. 3: 1803-1814 (1983)); introduction using a gene gun (see, e.g., Ulmer et al., Science 259: 1745-49 (1993)); introducing nucleic acid constructs into embryonic pluripotent stem cells and transferring the stem cells back into the blastocyst; and sperm-mediated gene transfer (see, e.g., Lavitrano et al., Cell 57: 717-723 (1989)).

[0322] Other techniques include, for example, nuclear transfer into enucleated oocytes of nuclei from cultured embryonic, fetal, or adult cells induced to quiescence (see, e.g., Campbell et al., Nature 380: 64-66 (1996); Wilmut et al., Nature 385: 810-813 (1997)). The present invention provides for transgenic animals that carry the transgene (i.e., a nucleic acid molecule of the invention) in all their cells, as well as animals which carry the transgene in some, but not all their cells, i.e., mosaic animals or chimeric animals.

[0323] The transgene may be integrated as a single transgene or as multiple copies, such as in concatamers, e.g., head-to-head tandems or head-to-tail tandems. The transgene may also be selectively introduced into and activated in a particular cell type by following, e.g., the teaching of Lasko et al., Proc. Natl. Acad. Sci. USA 89: 6232-6236 (1992). The regulatory sequences required for such a cell-type specific activation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art.

[0324] Once transgenic animals have been generated, the expression of the recombinant gene may be assayed utilizing standard techniques. Initial screening may be accomplished by Southern blot analysis or PCR techniques to analyze animal tissues to verify that integration of the transgene has taken place. The level of mRNA expression of the transgene in the tissues of the transgenic animals may also be assessed using techniques which include, but are not limited to, Northern blot analysis of tissue samples obtained from the animal, in situ hybridization analysis, and reverse transcriptase-PCR (RT-PCR). Samples of transgenic gene-expressing tissue may also be evaluated immunocytochemically or immunohistochemically using antibodies specific for the transgene product.

[0325] Once the founder animals are produced, they may be bred, inbred, outbred, or crossbred to produce colonies of the particular animal. Examples of such breeding strategies include, but are not limited to: outbreeding of founder animals with more than one integration site in order to establish separate lines; inbreeding of separate lines in order to produce compound transgenics that express the transgene at higher levels because of the effects of additive expression of each transgene; crossing of heterozygous transgenic animals to produce animals homozygous for a given integration site in order to both augment expression and eliminate the need for screening of animals by DNA analysis; crossing of separate homozygous lines to produce compound heterozygous or homozygous lines; and breeding to place the transgene on a distinct background that is appropriate for an experimental model of interest.
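
The crossing of heterozygous transgenic animals described above can be illustrated with a minimal sketch of the expected offspring genotypes. It assumes a single integration site, simple Mendelian segregation, and equal viability of all genotypes; the allele labels "T" (transgene present at the site) and "w" (no insert) and the function name are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: single integration site, Mendelian segregation assumed.
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict:
    """Expected genotype fractions among offspring of two parents.

    Each parent is a two-character genotype such as "Tw", where "T" marks
    the transgene at the integration site and "w" marks no insert.
    """
    # Each offspring receives one allele from each parent; sort so "wT" == "Tw".
    counts = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    for genotype, share in cross("Tw", "Tw").items():
        print(genotype, share)
```

Under these assumptions, a heterozygote-by-heterozygote cross is expected to yield one quarter homozygous transgenic offspring, while crossing a homozygous transgenic animal to a non-transgenic animal yields only heterozygotes.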

[0326] Transgenic animals of the invention have uses which include, but are not limited to, animal model systems useful in elaborating the biological function of polypeptides of the present invention, studying conditions and/or disorders associated with aberrant expression, and in screening for compounds effective in ameliorating such conditions and/or disorders.

[0327] Methods for creating a transgenic animal with a disruption of a targeted gene are also well-known in the art. In general, a vector is designed to comprise some nucleotide sequences homologous to the endogenous targeted gene. The vector is introduced into a cell so that it may integrate, via homologous recombination with chromosomal sequences, into the endogenous gene, thereby disrupting the function of the endogenous gene. The transgene may also be selectively introduced into a particular cell type, thus inactivating the endogenous gene in only that cell type. See, e.g., Gu et al., Science 265: 103-106 (1994). The regulatory sequences required for such a cell-type specific inactivation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art. See, e.g., Smithies et al., Nature 317: 230-234 (1985); Thomas et al., Cell 51: 503-512 (1987); Thompson et al., Cell 56: 313-321 (1989).

[0328] In one embodiment, a mutant, non-functional nucleic acid molecule of the invention (or a completely unrelated DNA sequence) flanked by DNA homologous to the endogenous nucleic acid sequence (either the coding regions or regulatory regions of the gene) can be used, with or without a selectable marker and/or a negative selectable marker, to transfect cells that express polypeptides of the invention in vivo. In another embodiment, techniques known in the art are used to generate knockouts in cells that contain, but do not express, the gene of interest. Insertion of the DNA construct, via targeted homologous recombination, results in inactivation of the targeted gene. Such approaches are particularly suited to research and agricultural fields where modifications to embryonic stem cells can be used to generate animal offspring with an inactive targeted gene. See, e.g., Thomas, supra and Thompson, supra. However, this approach can be routinely adapted for use in humans provided the recombinant DNA constructs are directly administered or targeted to the required site in vivo using appropriate viral vectors that will be apparent to those of skill in the art.

[0329] In further embodiments of the invention, cells that are genetically engineered to express the polypeptides of the invention, or alternatively, that are genetically engineered not to express the polypeptides of the invention (e.g., knockouts) are administered to a patient in vivo. Such cells may be obtained from an animal or patient or an MHC compatible donor and can include, but are not limited to, fibroblasts, bone marrow cells, blood cells (e.g., lymphocytes), adipocytes, muscle cells, endothelial cells, etc. The cells are genetically engineered in vitro using recombinant DNA techniques to introduce the coding sequence of polypeptides of the invention into the cells, or alternatively, to disrupt the coding sequence and/or endogenous regulatory sequence associated with the polypeptides of the invention, e.g., by transduction (using viral vectors, and preferably vectors that integrate the transgene into the cell genome) or transfection procedures, including, but not limited to, the use of plasmids, cosmids, YACs, naked DNA, electroporation, liposomes, etc.

[0330] The coding sequence of the polypeptides of the invention can be placed under the control of a strong constitutive or inducible promoter or promoter/enhancer to achieve expression, and preferably secretion, of the polypeptides of the invention. The engineered cells which express and preferably secrete the polypeptides of the invention can be introduced into the patient systemically, e.g., in the circulation, or intraperitoneally.

[0331] Alternatively, the cells can be incorporated into a matrix and implanted in the body, e.g., genetically engineered fibroblasts can be implanted as part of a skin graft; genetically engineered endothelial cells can be implanted as part of a lymphatic or vascular graft. See, e.g., U.S. Pat. Nos. 5,399,349 and 5,460,959, each of which is incorporated by reference herein in its entirety.

[0332] When the cells to be administered are non-autologous or non-MHC compatible cells, they can be administered using well-known techniques which prevent the development of a host immune response against the introduced cells. For example, the cells may be introduced in an encapsulated form which, while allowing for an exchange of components with the immediate extracellular environment, does not allow the introduced cells to be recognized by the host immune system.

[0333] Transgenic and “knock-out” animals of the invention have uses which include, but are not limited to, animal model systems useful in elaborating the biological function of polypeptides of the present invention, studying conditions and/or disorders associated with aberrant expression, and in screening for compounds effective in ameliorating such conditions and/or disorders.

[0334] Computer Readable Means

[0335] A further aspect of the invention relates to a computer readable means for storing the nucleic acid and amino acid sequences of the instant invention. In a preferred embodiment, the invention provides a computer readable means for storing SEQ ID NO: 1 through 115 and SEQ ID NO: 116 through 218 as described herein, as the complete set of sequences or in any combination. The records of the computer readable means can be accessed for reading and display and for interface with a computer system for the application of programs allowing for the location of data upon a query for data meeting certain criteria, the comparison of sequences, the alignment or ordering of sequences meeting a set of criteria, and the like.

[0336] The nucleic acid and amino acid sequences of the invention are particularly useful as components in databases useful for search analyses as well as in sequence analysis algorithms. As used herein, the terms “nucleic acid sequences of the invention” and “amino acid sequences of the invention” mean any detectable chemical or physical characteristic of a polynucleotide or polypeptide of the invention that is or may be reduced to or stored in a computer readable form. These include, without limitation, chromatographic scan data or peak data, photographic data or scan data therefrom, and mass spectrographic data.

[0337] This invention provides computer readable media having stored thereon sequences of the invention. A computer readable medium may comprise one or more of the following: a nucleic acid sequence comprising a sequence of a nucleic acid sequence of the invention; an amino acid sequence comprising an amino acid sequence of the invention; a set of nucleic acid sequences wherein at least one of said sequences comprises the sequence of a nucleic acid sequence of the invention; a set of amino acid sequences wherein at least one of said sequences comprises the sequence of an amino acid sequence of the invention; a data set representing a nucleic acid sequence comprising the sequence of one or more nucleic acid sequences of the invention; and a data set representing a nucleic acid sequence encoding an amino acid sequence comprising the sequence of an amino acid sequence of the invention. The computer readable medium can be any composition of matter used to store information or data, including, for example, commercially available floppy disks, tapes, hard drives, compact disks, and video disks.

[0338] Also provided by the invention are methods for the analysis of character sequences, particularly genetic sequences. Preferred methods of sequence analysis include, for example, methods of sequence homology analysis, such as identity and similarity analysis, RNA structure analysis, sequence assembly, cladistic analysis, sequence motif analysis, open reading frame determination, nucleic acid base calling, and sequencing chromatogram peak analysis.

[0339] A computer-based method is provided for performing nucleic acid sequence identity or similarity identification. This method comprises the steps of providing a nucleic acid sequence comprising the sequence of a nucleic acid of the invention in a computer readable medium; and comparing said nucleic acid sequence to at least one nucleic acid or amino acid sequence to identify sequence identity or similarity.
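
As an illustration only, the comparison step could be sketched in Python as follows. The function names `percent_identity` and `best_ungapped_identity`, and the choice of an ungapped sliding-window comparison, are assumptions made for this sketch and are not taken from the specification; practical identity searches typically rely on established alignment programs that also handle gaps.

```python
from typing import Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two equal-length, already aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    # Count positions where both sequences carry the same residue.
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def best_ungapped_identity(query: str, subject: str) -> Tuple[int, float]:
    """Slide the query along the subject without gaps.

    Returns (offset, percent identity) of the best-scoring window.
    """
    if not query or len(query) > len(subject):
        raise ValueError("query must be non-empty and no longer than subject")
    best_offset, best_pid = 0, -1.0
    for offset in range(len(subject) - len(query) + 1):
        window = subject[offset:offset + len(query)]
        pid = percent_identity(query, window)
        if pid > best_pid:
            best_offset, best_pid = offset, pid
    return best_offset, best_pid
```

For example, `best_ungapped_identity("GATT", "CCGATTAC")` locates the query at offset 2 of the subject with 100% identity.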

[0340] A computer-based method is also provided for performing amino acid homology identification, said method comprising the steps of: providing an amino acid sequence comprising the sequence of an amino acid of the invention in a computer readable medium; and comparing said amino acid sequence to at least one nucleic acid or amino acid sequence to identify homology.

[0341] A computer-based method is still further provided for assembly of overlapping nucleic acid sequences into a single nucleic acid sequence, said method comprising the steps of: providing a first nucleic acid sequence comprising the sequence of a nucleic acid of the invention in a computer readable medium; and screening for at least one overlapping region between said first nucleic acid sequence and a second nucleic acid sequence.
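
The overlap screening step might be sketched as below. This is a simplified illustration under stated assumptions: it considers only an exact match between the end of the first sequence and the start of the second, and the names `longest_overlap` and `merge_if_overlapping` and the `min_len` parameter are hypothetical. Real assemblers also tolerate sequencing errors and consider reverse complements.

```python
from typing import Optional


def longest_overlap(first: str, second: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `first` equal to a prefix of `second`.

    Returns 0 when no exact overlap of at least `min_len` bases exists.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    max_len = min(len(first), len(second))
    # Try the longest candidate overlap first and shrink until one matches.
    for k in range(max_len, min_len - 1, -1):
        if first[-k:] == second[:k]:
            return k
    return 0


def merge_if_overlapping(first: str, second: str, min_len: int = 3) -> Optional[str]:
    """Join two sequences into one when they overlap, else return None."""
    k = longest_overlap(first, second, min_len)
    if k == 0:
        return None
    return first + second[k:]
```

For example, `merge_if_overlapping("ACGTTGCA", "TGCAGGT")` finds the four-base overlap `TGCA` and returns the single sequence `ACGTTGCAGGT`.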

[0342] Diagnostic Methods for Breast Cancer

[0343] The present invention also relates to quantitative and qualitative diagnostic assays and methods for detecting, diagnosing, monitoring, staging and predicting cancers by comparing expression of a BSNA or a BSP in a human patient that has or may have breast cancer, or who is at risk of developing breast cancer, with the expression of a BSNA or a BSP in a normal human control. For purposes of the present invention, “expression of a BSNA” or “BSNA expression” means the quantity of BSG mRNA that can be measured by any method known in the art or the level of transcription that can be measured by any method known in the art in a cell, tissue, organ or whole patient. Similarly, the term “expression of a BSP” or “BSP expression” means the amount of BSP that can be measured by any method known in the art or the level of translation of a BSG BSNA that can be measured by any method known in the art.

[0344] The present invention provides methods for diagnosing breast cancer in a patient, in particular squamous cell carcinoma, by analyzing for changes in levels of BSNA or BSP in cells, tissues, organs or bodily fluids compared with levels of BSNA or BSP in cells, tissues, organs or bodily fluids of preferably the same type from a normal human control, wherein an increase, or decrease in certain cases, in levels of a BSNA or BSP in the patient versus the normal human control is associated with the presence of breast cancer or with a predilection to the disease. In another preferred embodiment, the present invention provides methods for diagnosing breast cancer in a patient by analyzing changes in the structure of the mRNA of a BSG compared to the mRNA from a normal control. These changes include, without limitation, aberrant splicing, alterations in polyadenylation and/or alterations in 5′ nucleotide capping. In yet another preferred embodiment, the present invention provides methods for diagnosing breast cancer in a patient by analyzing changes in a BSP compared to a BSP from a normal control. These changes include, e.g., alterations in glycosylation and/or phosphorylation of the BSP or subcellular BSP localization.

[0345] In a preferred embodiment, the expression of a BSNA is measured by determining the amount of an mRNA that encodes an amino acid sequence selected from SEQ ID NO: 116 through 218, a homolog, an allelic variant, or a fragment thereof. In a more preferred embodiment, the BSNA expression that is measured is the level of expression of a BSNA mRNA selected from SEQ ID NO: 1 through 115, or a hybridizing nucleic acid, homologous nucleic acid or allelic variant thereof, or a part of any of these nucleic acids. BSNA expression may be measured by any method known in the art, such as those described supra, including measuring mRNA expression by Northern blot, quantitative or qualitative reverse transcriptase PCR (RT-PCR), microarray, dot or slot blots or in situ hybridization. See, e.g., Ausubel (1992), supra; Ausubel (1999), supra; Sambrook (1989), supra; and Sambrook (2001), supra. BSNA transcription may be measured by any method known in the art, including using a reporter gene operably linked to the promoter of a BSG of interest or performing nuclear run-off assays. Alterations in mRNA structure, e.g., aberrant splicing variants, may be determined by any method known in the art, including RT-PCR followed by sequencing or restriction analysis. As necessary, BSNA expression may be compared to a known control, such as normal breast nucleic acid, to detect a change in expression.

[0346] In another preferred embodiment, the expression of a BSP is measured by determining the level of a BSP having an amino acid sequence selected from the group consisting of SEQ ID NO: 116 through 218, a homolog, an allelic variant, or a fragment thereof. Such levels are preferably determined in at least one of cells, tissues, organs and/or bodily fluids, including determination of normal and abnormal levels. Thus, for instance, a diagnostic assay in accordance with the invention for diagnosing over- or underexpression of BSNA or BSP compared to normal control bodily fluids, cells, or tissue samples may be used to diagnose the presence of breast cancer. The expression level of a BSP may be determined by any method known in the art, such as those described supra. In a preferred embodiment, the BSP expression level may be determined by radioimmunoassays, competitive-binding assays, ELISA, Western blot, FACS, immunohistochemistry, immunoprecipitation, or proteomic approaches, including two-dimensional gel electrophoresis (2D electrophoresis) and non-gel-based approaches such as mass spectrometry or protein interaction profiling. See, e.g., Harlow (1999), supra; Ausubel (1992), supra; and Ausubel (1999), supra. Alterations in the BSP structure may be determined by any method known in the art, including, e.g., using antibodies that specifically recognize phosphoserine, phosphothreonine or phosphotyrosine residues, two-dimensional polyacrylamide gel electrophoresis (2D PAGE) and/or chemical analysis of amino acid residues of the protein. Id.

[0347] In a preferred embodiment, a radioimmunoassay (RIA) or an ELISA is used. An antibody specific to a BSP is prepared if one is not already available. In a preferred embodiment, the antibody is a monoclonal antibody. The anti-BSP antibody is bound to a solid support and any free protein binding sites on the solid support are blocked with a protein such as bovine serum albumin. A sample of interest is incubated with the antibody on the solid support under conditions in which the BSP will bind to the anti-BSP antibody. The sample is removed, the solid support is washed to remove unbound material, and an anti-BSP antibody that is linked to a detectable reagent (a radioactive substance for RIA and an enzyme for ELISA) is added to the solid support and incubated under conditions in which binding of the BSP to the labeled antibody will occur. After binding, the unbound labeled antibody is removed by washing. For an ELISA, one or more substrates are added to produce a colored reaction product that is based upon the amount of a BSP in the sample. For an RIA, the solid support is counted for radioactive decay signals by any method known in the art. Quantitative results for both RIA and ELISA typically are obtained by reference to a standard curve.
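
Reading a sample value off a standard curve can be sketched as follows. The sketch assumes that signal rises monotonically with concentration and interpolates linearly between neighboring standards; both are simplifying assumptions, and the function name `interpolate_concentration` and the example values are hypothetical. Curve fitting with a nonlinear model is common in practice and is not shown here.

```python
from bisect import bisect_left
from typing import List, Tuple


def interpolate_concentration(
    standards: List[Tuple[float, float]], signal: float
) -> float:
    """Estimate analyte concentration from a measured signal.

    `standards` holds (concentration, signal) pairs measured on known
    standards. Linear interpolation between neighboring standards is an
    assumption of this sketch.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are required")
    points = sorted(standards, key=lambda p: p[1])
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the standard curve")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        # The measured signal coincides with one of the standards.
        return points[i][0]
    (c0, s0), (c1, s1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


# Hypothetical standards: (concentration, measured signal).
curve = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]
estimate = interpolate_concentration(curve, 0.35)  # midway between 10 and 20
```

Signals outside the range of the standards raise an error rather than being extrapolated, since extrapolation beyond the measured standards is generally unreliable.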

[0348] Other methods to measure BSP levels are known in the art. For instance, a competition assay may be employed wherein an anti-BSP antibody is attached to a solid support and an allocated amount of a labeled BSP and a sample of interest are incubated with the solid support. The amount of labeled BSP detected which is attached to the solid support can be correlated to the quantity of a BSP in the sample.

[0349] Of the proteomic approaches, 2D PAGE is a well-known technique. Isolation of individual proteins from a sample such as serum is accomplished using sequential separation of proteins by isoelectric point and molecular weight. Typically, polypeptides are first separated by isoelectric point (the first dimension) and then separated by size using an electric current (the second dimension). In general, the second dimension is perpendicular to the first dimension. Because no two proteins with different sequences are identical on the basis of both size and charge, the result of 2D PAGE is a roughly square gel in which each protein occupies a unique spot. Analysis of the spots with chemical or antibody probes, or subsequent protein microsequencing can reveal the relative abundance of a given protein and the identity of the proteins in the sample.

[0350] Expression levels of a BSNA can be determined by any method known in the art. PCR and other nucleic acid methods, such as ligase chain reaction (LCR) and nucleic acid sequence based amplification (NASBA), can be used to detect malignant cells for diagnosis and monitoring of various malignancies. For example, reverse-transcriptase PCR (RT-PCR) is a powerful technique which can be used to detect the presence of a specific mRNA population in a complex mixture of thousands of other mRNA species. In RT-PCR, an mRNA species is first reverse transcribed to complementary DNA (cDNA) with use of the enzyme reverse transcriptase; the cDNA is then amplified as in a standard PCR reaction.

[0351] Hybridization to specific DNA molecules (e.g., oligonucleotides) arrayed on a solid support can be used to both detect the expression of and quantitate the level of expression of one or more BSNAs of interest. In this approach, all or a portion of one or more BSNAs is fixed to a substrate. A sample of interest, which may comprise RNA, e.g., total RNA or polyA-selected mRNA, or a complementary DNA (cDNA) copy of the RNA is incubated with the solid support under conditions in which hybridization will occur between the DNA on the solid support and the nucleic acid molecules in the sample of interest. Hybridization between the substrate-bound DNA and the nucleic acid molecules in the sample can be detected and quantitated by several means, including, without limitation, radioactive labeling or fluorescent labeling of the nucleic acid molecule or a secondary molecule designed to detect the hybrid.

[0352] The above tests can be carried out on samples derived from a variety of cells, bodily fluids and/or tissue extracts such as homogenates or solubilized tissue obtained from a patient. Tissue extracts are obtained routinely from tissue biopsy and autopsy material. Bodily fluids useful in the present invention include blood, urine, saliva or any other bodily secretion or derivative thereof. By blood it is meant to include whole blood, plasma, serum or any derivative of blood. In a preferred embodiment, the specimen tested for expression of BSNA or BSP includes, without limitation, breast tissue, fluid obtained by bronchial alveolar lavage (BAL), sputum, breast cells grown in cell culture, blood, serum, lymph node tissue and lymphatic fluid. In another preferred embodiment, especially when metastasis of a primary breast cancer is known or suspected, specimens include, without limitation, tissues from brain, bone, bone marrow, liver, adrenal glands and colon. In general, the tissues may be sampled by biopsy, including, without limitation, needle biopsy, e.g., transthoracic needle aspiration, cervical mediastinoscopy, endoscopic lymph node biopsy, video-assisted thoracoscopy, exploratory thoracotomy, bone marrow biopsy and bone marrow aspiration. See Scott, supra and Franklin, pp. 529-570, in Kane, supra. For early and inexpensive detection, assaying for changes in BSNAs or BSPs in cells in sputum samples may be particularly useful. Methods of obtaining and analyzing sputum samples are disclosed in Franklin, supra.

[0353] All the methods of the present invention may optionally include determining the expression levels of one or more other cancer markers in addition to determining the expression level of a BSNA or BSP. In many cases, the use of another cancer marker will decrease the likelihood of false positives or false negatives. In one embodiment, the one or more other cancer markers include other BSNA or BSPs as disclosed herein. Other cancer markers useful in the present invention will depend on the cancer being tested and are known to those of skill in the art. In a preferred embodiment, at least one other cancer marker in addition to a particular BSNA or BSP is measured. In a more preferred embodiment, at least two other additional cancer markers are used. In an even more preferred embodiment, at least three, more preferably at least five, even more preferably at least ten additional cancer markers are used.

[0354] Diagnosing

[0355] In one aspect, the invention provides a method for determining the expression levels and/or structural alterations of one or more BSNAs and/or BSPs in a sample from a patient suspected of having breast cancer. In general, the method comprises the steps of obtaining the sample from the patient, determining the expression level or structural alterations of a BSNA and/or BSP and then ascertaining whether the patient has breast cancer from the expression level of the BSNA or BSP. In general, if high expression relative to a control of a BSNA or BSP is indicative of breast cancer, a diagnostic assay is considered positive if the level of expression of the BSNA or BSP is at least two times higher, more preferably at least five times higher, even more preferably at least ten times higher, than in preferably the same cells, tissues or bodily fluid of a normal human control. In contrast, if low expression relative to a control of a BSNA or BSP is indicative of breast cancer, a diagnostic assay is considered positive if the level of expression of the BSNA or BSP is at least two times lower, more preferably at least five times lower, even more preferably at least ten times lower than in preferably the same cells, tissues or bodily fluid of a normal human control. The normal human control may be from a different patient or from uninvolved tissue of the same patient.
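
The fold-difference criterion above can be expressed as a short sketch. The function names `fold_change` and `assay_is_positive` and the `direction` parameter are hypothetical, and the sketch assumes that patient and control levels are positive values measured in the same units on the same kind of sample.

```python
def fold_change(patient_level: float, control_level: float) -> float:
    """Ratio of the patient expression level to the normal control level."""
    if patient_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    return patient_level / control_level


def assay_is_positive(
    patient_level: float,
    control_level: float,
    direction: str = "high",
    threshold: float = 2.0,
) -> bool:
    """Apply the fold-difference criterion.

    direction="high": positive when the patient level is at least
    `threshold` times higher than the control.
    direction="low": positive when it is at least `threshold` times lower.
    A threshold of 2.0 is the minimum criterion in the text; 5.0 and 10.0
    correspond to the more preferred criteria.
    """
    ratio = fold_change(patient_level, control_level)
    if direction == "high":
        return ratio >= threshold
    if direction == "low":
        return ratio <= 1.0 / threshold
    raise ValueError("direction must be 'high' or 'low'")
```

For instance, a patient level of 10.0 against a control level of 4.0 is a 2.5-fold increase, which is positive at the two-fold criterion but not at the five-fold criterion.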

[0356] The present invention also provides a method of determining whether breast cancer has metastasized in a patient. One may identify whether the breast cancer has metastasized by measuring the expression levels and/or structural alterations of one or more BSNAs and/or BSPs in a variety of tissues. The presence of a BSNA or BSP in a certain tissue at levels higher than that of corresponding noncancerous tissue (e.g., the same tissue from another individual) is indicative of metastasis if high level expression of a BSNA or BSP is associated with breast cancer. Similarly, the presence of a BSNA or BSP in a tissue at levels lower than that of corresponding noncancerous tissue is indicative of metastasis if low level expression of a BSNA or BSP is associated with breast cancer. Further, the presence of a structurally altered BSNA or BSP that is associated with breast cancer is also indicative of metastasis.

[0357] In general, if high expression relative to a control of a BSNA or BSP is indicative of metastasis, an assay for metastasis is considered positive if the level of expression of the BSNA or BSP is at least two times higher, more preferably at least five times higher, even more preferably at least ten times higher, than in preferably the same cells, tissues or bodily fluid of a normal human control. In contrast, if low expression relative to a control of a BSNA or BSP is indicative of metastasis, an assay for metastasis is considered positive if the level of expression of the BSNA or BSP is at least two times lower, more preferably at least five times lower, even more preferably at least ten times lower than in preferably the same cells, tissues or bodily fluid of a normal human control.

[0358] The BSNA or BSP of this invention may be used as an element in an array or a multi-analyte test to recognize expression patterns associated with breast cancers or other breast related disorders. In addition, the sequences of either the nucleic acids or proteins may be used as elements in a computer program for pattern recognition of breast disorders.

[0359] Staging

[0360] The invention also provides a method of staging breast cancer in a human patient. The method comprises identifying a human patient having breast cancer and analyzing cells, tissues or bodily fluids from such human patient for expression levels and/or structural alterations of one or more BSNAs or BSPs. First, one or more tumors from a variety of patients are staged according to procedures well-known in the art, and the expression level of one or more BSNAs or BSPs is determined for each stage to obtain a standard expression level for each BSNA and BSP. Then, the BSNA or BSP expression levels are determined in a biological sample from a patient whose stage of cancer is not known. The BSNA or BSP expression levels from the patient are then compared to the standard expression level. By comparing the expression level of the BSNAs and BSPs from the patient to the standard expression levels, one may determine the stage of the tumor. The same procedure may be followed using structural alterations of a BSNA or BSP to determine the stage of a breast cancer.
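
One way to sketch the comparison of a patient's expression levels against per-stage standards is a nearest-profile lookup. The specification does not prescribe how the comparison is made; the Euclidean distance used here, the function name `closest_stage`, and the marker names and values are all assumptions for illustration.

```python
import math
from typing import Dict


def closest_stage(
    stage_standards: Dict[str, Dict[str, float]],
    patient_levels: Dict[str, float],
) -> str:
    """Return the stage whose standard profile is nearest the patient's.

    `stage_standards` maps a stage label to the standard expression level
    of each marker at that stage. Every standard must include all markers
    measured in the patient.
    """
    if not stage_standards or not patient_levels:
        raise ValueError("standards and patient levels must be non-empty")
    best_stage, best_distance = None, math.inf
    for stage, standard in stage_standards.items():
        missing = [m for m in patient_levels if m not in standard]
        if missing:
            raise KeyError(f"stage {stage} lacks markers: {missing}")
        # Euclidean distance over the patient's markers (an assumption).
        distance = math.sqrt(
            sum((patient_levels[m] - standard[m]) ** 2 for m in patient_levels)
        )
        if distance < best_distance:
            best_stage, best_distance = stage, distance
    return best_stage


# Hypothetical standards and patient measurements.
standards = {
    "I": {"marker_a": 1.0, "marker_b": 2.0},
    "II": {"marker_a": 3.0, "marker_b": 4.0},
    "III": {"marker_a": 6.0, "marker_b": 8.0},
}
stage = closest_stage(standards, {"marker_a": 2.8, "marker_b": 4.5})
```

With these illustrative numbers the patient profile lies nearest the stage II standard.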

[0361] Monitoring

[0362] Further provided is a method of monitoring breast cancer in a human patient. One may monitor a human patient to determine whether there has been metastasis and, if there has been, when metastasis began to occur. One may also monitor a human patient to determine whether a preneoplastic lesion has become cancerous. One may also monitor a human patient to determine whether a therapy, e.g., chemotherapy, radiotherapy or surgery, has decreased or eliminated the breast cancer. The method comprises identifying a human patient that one wants to monitor for breast cancer, periodically analyzing cells, tissues or bodily fluids from such human patient for expression levels of one or more BSNAs or BSPs, and comparing the BSNA or BSP levels over time to those BSNA or BSP expression levels obtained previously. Patients may also be monitored by measuring one or more structural alterations in a BSNA or BSP that are associated with breast cancer.

[0363] If increased expression of a BSNA or BSP is associated with metastasis, treatment failure, or conversion of a preneoplastic lesion to a cancerous lesion, then detecting an increase in the expression level of a BSNA or BSP indicates that the tumor is metastasizing, that treatment has failed or that the lesion is cancerous, respectively. One having ordinary skill in the art would recognize that if this were the case, then a decreased expression level would be indicative of no metastasis, effective therapy or failure to progress to a neoplastic lesion. If decreased expression of a BSNA or BSP is associated with metastasis, treatment failure, or conversion of a preneoplastic lesion to a cancerous lesion, then detecting a decrease in the expression level of a BSNA or BSP indicates that the tumor is metastasizing, that treatment has failed or that the lesion is cancerous, respectively. In a preferred embodiment, the levels of BSNAs or BSPs are determined from the same cell type, tissue or bodily fluid as prior patient samples. Monitoring a patient for onset of breast cancer metastasis is periodic and preferably is done on a quarterly basis, but may be done more or less frequently.
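
Comparing periodic measurements against earlier ones might be sketched as follows. The monitoring paragraphs do not state how large a change must be to count as an increase or decrease; the two-fold default used here, the comparison of the latest value against the first, and the name `expression_trend` are assumptions of this sketch.

```python
from typing import Sequence


def expression_trend(levels: Sequence[float], min_fold: float = 2.0) -> str:
    """Classify the change between the first and the latest measurement.

    `levels` holds expression levels of one marker in chronological order,
    taken from the same cell type, tissue or bodily fluid.
    Returns "increased", "decreased" or "stable".
    """
    if len(levels) < 2:
        raise ValueError("at least two measurements are required")
    if min(levels) <= 0 or min_fold <= 1.0:
        raise ValueError("levels must be positive and min_fold above 1")
    ratio = levels[-1] / levels[0]
    if ratio >= min_fold:
        return "increased"
    if ratio <= 1.0 / min_fold:
        return "decreased"
    return "stable"
```

How a trend is interpreted depends on the marker: for a marker whose increased expression is associated with metastasis, an "increased" result would point toward progression, while for a marker associated through decreased expression the opposite reading applies.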

[0364] The methods described herein can further be utilized as prognostic assays to identify subjects having or at risk of developing a disease or disorder associated with increased or decreased expression levels of a BSNA and/or BSP. The present invention provides a method in which a test sample is obtained from a human patient and one or more BSNAs and/or BSPs are detected. The presence of higher (or lower) BSNA or BSP levels as compared to normal human controls is diagnostic for the human patient being at risk for developing cancer, particularly breast cancer. The effectiveness of therapeutic agents to decrease (or increase) expression or activity of one or more BSNAs and/or BSPs of the invention can also be monitored by analyzing levels of expression of the BSNAs and/or BSPs in a human patient in clinical trials or in in vitro screening assays such as in human cells. In this way, the gene expression pattern can serve as a marker, indicative of the physiological response of the human patient or cells, as the case may be, to the agent being tested.

Detection of Genetic Lesions or Mutations

[0365] The methods of the present invention can also be used to detect genetic lesions or mutations in a BSG, thereby determining if a human with the genetic lesion is susceptible to developing breast cancer or to determine what genetic lesions are responsible, or are partly responsible, for a person's existing breast cancer. Genetic lesions can be detected, for example, by ascertaining the existence of a deletion, insertion and/or substitution of one or more nucleotides from the BSGs of this invention, a chromosomal rearrangement of BSG, an aberrant modification of BSG (such as of the methylation pattern of the genomic DNA), or allelic loss of a BSG. Methods to detect such lesions in the BSG of this invention are known to those having ordinary skill in the art following the teachings of the specification.

[0366] Methods of Detecting Noncancerous Breast Diseases

[0367] The invention also provides a method for determining the expression levels and/or structural alterations of one or more BSNAs and/or BSPs in a sample from a patient suspected of having or known to have a noncancerous breast disease. In general, the method comprises the steps of obtaining a sample from the patient, determining the expression level or structural alterations of a BSNA and/or BSP, comparing the expression level or structural alteration of the BSNA or BSP to a normal breast control, and then ascertaining whether the patient has a noncancerous breast disease. In general, if high expression relative to a control of a BSNA or BSP is indicative of a particular noncancerous breast disease, a diagnostic assay is considered positive if the level of expression of the BSNA or BSP is at least two times higher, more preferably at least five times higher, even more preferably at least ten times higher, than in preferably the same cells, tissues or bodily fluid of a normal human control. In contrast, if low expression relative to a control of a BSNA or BSP is indicative of a noncancerous breast disease, a diagnostic assay is considered positive if the level of expression of the BSNA or BSP is at least two times lower, more preferably at least five times lower, even more preferably at least ten times lower than in preferably the same cells, tissues or bodily fluid of a normal human control. The normal human control may be from a different patient or from uninvolved tissue of the same patient.

[0368] One having ordinary skill in the art may determine whether a BSNA and/or BSP is associated with a particular noncancerous breast disease by obtaining breast tissue from a patient having a noncancerous breast disease of interest and determining which BSNAs and/or BSPs are expressed in the tissue at either a higher or a lower level than in normal breast tissue. In another embodiment, one may determine whether a BSNA or BSP exhibits structural alterations in a particular noncancerous breast disease state by obtaining breast tissue from a patient having a noncancerous breast disease of interest and determining the structural alterations in one or more BSNAs and/or BSPs relative to normal breast tissue.

[0369] Methods for Identifying Breast Tissue

[0370] In another aspect, the invention provides methods for identifying breast tissue. These methods are particularly useful in, e.g., forensic science, breast cell differentiation and development, and in tissue engineering.

[0371] In one embodiment, the invention provides a method for determining whether a sample is breast tissue or has breast tissue-like characteristics. The method comprises the steps of providing a sample suspected of comprising breast tissue or having breast tissue-like characteristics, determining whether the sample expresses one or more BSNAs and/or BSPs, and, if the sample expresses one or more BSNAs and/or BSPs, concluding that the sample comprises breast tissue. In a preferred embodiment, the BSNA encodes a polypeptide having an amino acid sequence selected from SEQ ID NO: 116 through 218, or a homolog, allelic variant or fragment thereof. In a more preferred embodiment, the BSNA has a nucleotide sequence selected from SEQ ID NO: 1 through 115, or a hybridizing nucleic acid, an allelic variant or a part thereof. Determining whether a sample expresses a BSNA can be accomplished by any method known in the art. Preferred methods include hybridization to microarrays, Northern blot hybridization, and quantitative or qualitative RT-PCR. In another preferred embodiment, the method can be practiced by determining whether a BSP is expressed. Determining whether a sample expresses a BSP can be accomplished by any method known in the art. Preferred methods include Western blot, ELISA, RIA and 2D PAGE. In one embodiment, the BSP has an amino acid sequence selected from SEQ ID NO: 116 through 218, or a homolog, allelic variant or fragment thereof. In another preferred embodiment, the expression of at least two BSNAs and/or BSPs is determined. In a more preferred embodiment, the expression of at least three, more preferably four and even more preferably five BSNAs and/or BSPs is determined.
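
The decision rule in this paragraph can be sketched as a simple panel check. The marker identifiers, the function names `detected_markers` and `comprises_breast_tissue`, and the representation of a sample as a collection of detected marker names are hypothetical; how expression is actually detected (microarray, RT-PCR, Western blot and so on) is outside the sketch.

```python
from typing import Iterable, Set


def detected_markers(
    sample_expressed: Iterable[str], marker_panel: Iterable[str]
) -> Set[str]:
    """Markers from the panel that were found expressed in the sample."""
    return set(sample_expressed) & set(marker_panel)


def comprises_breast_tissue(
    sample_expressed: Iterable[str],
    marker_panel: Iterable[str],
    min_markers: int = 1,
) -> bool:
    """Conclude breast tissue when enough panel markers are expressed.

    min_markers=1 mirrors the basic embodiment; values of 2 to 5 mirror
    the preferred embodiments that require several markers.
    """
    if min_markers < 1:
        raise ValueError("min_markers must be at least 1")
    found = detected_markers(sample_expressed, marker_panel)
    return len(found) >= min_markers


# Hypothetical marker names, used only for illustration.
panel = {"BSNA_1", "BSNA_2", "BSP_3"}
sample = ["BSNA_2", "unrelated_transcript"]
```

Here `comprises_breast_tissue(sample, panel)` is true under the one-marker rule, while requiring two markers with `min_markers=2` gives a negative result for the same sample.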

[0372] In one embodiment, the method can be used to determine whether an unknown tissue is breast tissue. This is particularly useful in forensic science, in which small, damaged pieces of tissues that are not identifiable by microscopic or other means are recovered from a crime or accident scene. In another embodiment, the method can be used to determine whether a tissue is differentiating or developing into breast tissue. This is important in monitoring the effects of the addition of various agents to cell or tissue culture, e.g., in producing new breast tissue by tissue engineering. These agents include, e.g., growth and differentiation factors, extracellular matrix proteins and culture medium. Other factors that may be measured for effects on tissue development and differentiation include gene transfer into the cells or tissues, alterations in pH, aqueous:air interface and various other culture conditions.

[0373] Methods for Producing and Modifying Breast Tissue

[0374] In another aspect, the invention provides methods for producing engineered breast tissue or cells. In one embodiment, the method comprises the steps of providing cells, introducing a BSNA or a BSG into the cells, and growing the cells under conditions in which they exhibit one or more properties of breast tissue cells. In a preferred embodiment, the cells are pluripotent. As is well-known in the art, normal breast tissue comprises a large number of different cell types. Thus, in one embodiment, the engineered breast tissue or cells comprise one of these cell types. In another embodiment, the engineered breast tissue or cells comprise more than one breast cell type. Further, the culture conditions of the cells or tissue may require manipulation in order to achieve full differentiation and development of the breast cell tissue. Methods for manipulating culture conditions are well-known in the art.

[0375] Nucleic acid molecules encoding one or more BSPs are introduced into cells, preferably pluripotent cells. In a preferred embodiment, the nucleic acid molecules encode BSPs having amino acid sequences selected from SEQ ID NO: 116 through 218, or homologous proteins, analogs, allelic variants or fragments thereof. In a more preferred embodiment, the nucleic acid molecules have a nucleotide sequence selected from SEQ ID NO: 1 through 115, or hybridizing nucleic acids, allelic variants or parts thereof. In another highly preferred embodiment, a BSG is introduced into the cells. Expression vectors and methods of introducing nucleic acid molecules into cells are well-known in the art and are described in detail, supra.

[0376] Artificial breast tissue may be used to treat patients who have lost some or all of their breast function.

[0377] Pharmaceutical Compositions

[0378] In another aspect, the invention provides pharmaceutical compositions comprising the nucleic acid molecules, polypeptides, antibodies, antibody derivatives, antibody fragments, agonists, antagonists, and inhibitors of the present invention. In a preferred embodiment, the pharmaceutical composition comprises a BSNA or part thereof. In a more preferred embodiment, the BSNA has a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 through 115, a nucleic acid that hybridizes thereto, an allelic variant thereof, or a nucleic acid that has substantial sequence identity thereto. In another preferred embodiment, the pharmaceutical composition comprises a BSP or fragment thereof. In a more preferred embodiment, the BSP has an amino acid sequence that is selected from the group consisting of SEQ ID NO: 116 through 218, a polypeptide that is homologous thereto, a fusion protein comprising all or a portion of the polypeptide, or an analog or derivative thereof. In another preferred embodiment, the pharmaceutical composition comprises an anti-BSP antibody, preferably an antibody that specifically binds to a BSP having an amino acid sequence that is selected from the group consisting of SEQ ID NO: 116 through 218, or an antibody that binds to a polypeptide that is homologous thereto, a fusion protein comprising all or a portion of the polypeptide, or an analog or derivative thereof.

[0379] Such a composition typically contains from about 0.1 to 90% by weight of a therapeutic agent of the invention formulated in and/or with a pharmaceutically acceptable carrier or excipient.

[0380] Pharmaceutical formulation is a well-established art, and is further described in Gennaro (ed.), Remington: The Science and Practice of Pharmacy, 20th ed., Lippincott, Williams & Wilkins (2000); Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery Systems, 7th ed., Lippincott Williams & Wilkins (1999); and Kibbe (ed.), Handbook of Pharmaceutical Excipients, American Pharmaceutical Association, 3rd ed. (2000), the disclosures of which are incorporated herein by reference in their entireties, and thus need not be described in detail herein.

[0381] Briefly, formulation of the pharmaceutical compositions of the present invention will depend upon the route chosen for administration. The pharmaceutical compositions utilized in this invention can be administered by various routes including both enteral and parenteral routes, including oral, intravenous, intramuscular, subcutaneous, inhalation, topical, sublingual, rectal, intra-arterial, intramedullary, intrathecal, intraventricular, transmucosal, transdermal, intranasal, intraperitoneal, intrapulmonary, and intrauterine.

[0382] Oral dosage forms can be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the patient.

[0383] Solid formulations of the compositions for oral administration can contain suitable carriers or excipients, such as carbohydrate or protein fillers, such as sugars, including lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, or microcrystalline cellulose; gums including arabic and tragacanth; proteins such as gelatin and collagen; inorganics, such as kaolin, calcium carbonate, dicalcium phosphate, sodium chloride; and other agents such as acacia and alginic acid.

[0384] Agents that facilitate disintegration and/or solubilization can be added, such as cross-linked polyvinyl pyrrolidone, agar, alginic acid or a salt thereof, such as sodium alginate, microcrystalline cellulose, corn starch, and sodium starch glycolate.

[0385] Tablet binders that can be used include acacia, methylcellulose, sodium carboxymethylcellulose, polyvinylpyrrolidone (Povidone™), hydroxypropyl methylcellulose, sucrose, starch and ethylcellulose.

[0386] Lubricants that can be used include magnesium stearates, stearic acid, silicone fluid, talc, waxes, oils, and colloidal silica.

[0387] Fillers, agents that facilitate disintegration and/or solubilization, tablet binders and lubricants, including the aforementioned, can be used singly or in combination.

[0388] Solid oral dosage forms need not be uniform throughout. For example, dragee cores can be used in conjunction with suitable coatings, such as concentrated sugar solutions, which can also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.

[0389] Oral dosage forms of the present invention include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or sorbitol. Push-fit capsules can contain active ingredients mixed with a filler or binders, such as lactose or starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In soft capsules, the active compounds can be dissolved or suspended in suitable liquids, such as fatty oils, liquid, or liquid polyethylene glycol with or without stabilizers.

[0390] Additionally, dyestuffs or pigments can be added to the tablets or dragee coatings for product identification or to characterize the quantity of active compound, i.e., dosage.

[0391] Liquid formulations of the pharmaceutical compositions for oral (enteral) administration are prepared in water or other aqueous vehicles and can contain various suspending agents such as methylcellulose, alginates, tragacanth, pectin, kelgin, carrageenan, acacia, polyvinylpyrrolidone, and polyvinyl alcohol. The liquid formulations can also include solutions, emulsions, syrups and elixirs containing, together with the active compound(s), wetting agents, sweeteners, and coloring and flavoring agents.

[0392] The pharmaceutical compositions of the present invention can also be formulated for parenteral administration. Formulations for parenteral administration can be in the form of aqueous or non-aqueous isotonic sterile injection solutions or suspensions.

[0393] For intravenous injection, water soluble versions of the compounds of the present invention are formulated in, or if provided as a lyophilate, mixed with, a physiologically acceptable fluid vehicle, such as 5% dextrose (“D5”), physiologically buffered saline, 0.9% saline, Hanks' solution, or Ringer's solution. Intravenous formulations may include carriers, excipients or stabilizers including, without limitation, calcium, human serum albumin, citrate, acetate, calcium chloride, carbonate, and other salts.

[0394] Intramuscular preparations, e.g., a sterile formulation of a suitable soluble salt form of the compounds of the present invention, can be dissolved and administered in a pharmaceutical excipient such as Water-for-Injection, 0.9% saline, or 5% glucose solution. Alternatively, a suitable insoluble form of the compound can be prepared and administered as a suspension in an aqueous base or a pharmaceutically acceptable oil base, such as an ester of a long chain fatty acid (e.g., ethyl oleate), fatty oils such as sesame oil, triglycerides, or liposomes.

[0395] Parenteral formulations of the compositions can contain various carriers such as vegetable oils, dimethylacetamide, dimethylformamide, ethyl lactate, ethyl carbonate, isopropyl myristate, ethanol, polyols (glycerol, propylene glycol, liquid polyethylene glycol, and the like).

[0396] Aqueous injection suspensions can also contain substances that increase the viscosity of the suspension, such as sodium carboxymethylcellulose, sorbitol, or dextran. Non-lipid polycationic amino polymers can also be used for delivery. Optionally, the suspension can also contain suitable stabilizers or agents that increase the solubility of the compounds to allow for the preparation of highly concentrated solutions.

[0397] Pharmaceutical compositions of the present invention can also be formulated to permit injectable, long-term deposition. Injectable depot forms may be made by forming microencapsulated matrices of the compound in biodegradable polymers such as polylactide-polyglycolide. Depending upon the ratio of drug to polymer and the nature of the particular polymer employed, the rate of drug release can be controlled. Examples of other biodegradable polymers include poly(orthoesters) and poly(anhydrides). Depot injectable formulations are also prepared by entrapping the drug in microemulsions that are compatible with body tissues.

[0398] The pharmaceutical compositions of the present invention can be administered topically.

[0399] For topical use the compounds of the present invention can also be prepared in suitable forms to be applied to the skin, or mucous membranes of the nose and throat, and can take the form of lotions, creams, ointments, liquid sprays or inhalants, drops, tinctures, lozenges, or throat paints. Such topical formulations further can include chemical compounds such as dimethylsulfoxide (DMSO) to facilitate surface penetration of the active ingredient. In other transdermal formulations, typically in patch-delivered formulations, the pharmaceutically active compound is formulated with one or more skin penetrants, such as 2-N-methyl-pyrrolidone (NMP) or Azone. A topical semi-solid ointment formulation typically contains a concentration of the active ingredient from about 1 to 20%, e.g., 5 to 10%, in a carrier such as a pharmaceutical cream base.

[0400] For application to the eyes or ears, the compounds of the present invention can be presented in liquid or semi-liquid form formulated in hydrophobic or hydrophilic bases as ointments, creams, lotions, paints or powders.

[0401] For rectal administration the compounds of the present invention can be administered in the form of suppositories admixed with conventional carriers such as cocoa butter, wax or other glyceride.

[0402] Inhalation formulations can also readily be formulated. For inhalation, various powder and liquid formulations can be prepared. For aerosol preparations, a sterile formulation of the compound or salt form of the compound may be used in inhalers, such as metered dose inhalers, and nebulizers. Aerosolized forms may be especially useful for treating respiratory disorders.

[0403] Alternatively, the compounds of the present invention can be in powder form for reconstitution in the appropriate pharmaceutically acceptable carrier at the time of delivery.

[0404] The pharmaceutically active compound in the pharmaceutical compositions of the present invention can be provided as the salt of a variety of acids, including but not limited to hydrochloric, sulfuric, acetic, lactic, tartaric, malic, and succinic acid. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms.

[0405] After pharmaceutical compositions have been prepared, they are packaged in an appropriate container and labeled for treatment of an indicated condition.

[0406] The active compound will be present in an amount effective to achieve the intended purpose. The determination of an effective dose is well within the capability of those skilled in the art.

[0407] A “therapeutically effective dose” refers to that amount of active ingredient, for example BSP polypeptide, fusion protein, or fragments thereof, antibodies specific for BSP, agonists, antagonists or inhibitors of BSP, which ameliorates the signs or symptoms of the disease or prevents progression thereof; as would be understood in the medical arts, cure, although desired, is not required.

[0408] The therapeutically effective dose of the pharmaceutical agents of the present invention can be estimated initially by in vitro tests, such as cell culture assays, followed by assay in model animals, usually mice, rats, rabbits, dogs, or pigs. The animal model can also be used to determine an initial preferred concentration range and route of administration.

[0409] For example, the ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population) can be determined in one or more cell culture or animal model systems. The dose ratio of toxic to therapeutic effects is the therapeutic index, which can be expressed as LD50/ED50. Pharmaceutical compositions that exhibit large therapeutic indices are preferred.
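As a worked illustration of the LD50/ED50 ratio defined above, the following minimal sketch computes a therapeutic index from two dose values. The function name and the numerical values are hypothetical and serve only to show the arithmetic; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Therapeutic index expressed as LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values in mg/kg, chosen only for illustration.
print(therapeutic_index(ld50=500.0, ed50=25.0))  # 20.0
```

A larger ratio indicates a wider margin between the dose that is effective in half of the population and the dose that is lethal to half of the population.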

[0410] The data obtained from cell culture assays and animal studies are used in formulating an initial dosage range for human use, and preferably provide a range of circulating concentrations that includes the ED50 with little or no toxicity. After administration, or between successive administrations, the circulating concentration of active agent varies within this range depending upon pharmacokinetic factors well-known in the art, such as the dosage form employed, sensitivity of the patient, and the route of administration.

[0411] The exact dosage will be determined by the practitioner, in light of factors specific to the subject requiring treatment. Factors that can be taken into account by the practitioner include the severity of the disease state, general health of the subject, age, weight, gender of the subject, diet, time and frequency of administration, drug combination(s), reaction sensitivities, and tolerance/response to therapy. Long-acting pharmaceutical compositions can be administered every 3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the particular formulation.

[0412] Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, depending upon the route of administration. Where the therapeutic agent is a protein or antibody of the present invention, the therapeutic protein or antibody agent typically is administered at a daily dosage of 0.01 mg to 30 mg/kg of body weight of the patient (e.g., 1 mg/kg to 5 mg/kg). The pharmaceutical formulation can be administered in multiple doses per day, if desired, to achieve the total desired daily dose.
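The weight-based daily dosage stated above reduces to simple arithmetic, sketched here under stated assumptions. The function names, the patient weight and the chosen mg/kg value are hypothetical; only the 0.01 to 30 mg/kg range is taken from the text, and the sketch shows the calculation rather than any dosing recommendation.

```python
def daily_dose_mg(weight_kg: float, dose_mg_per_kg: float,
                  low: float = 0.01, high: float = 30.0) -> float:
    """Total daily dose in mg for a weight-based dosage within [low, high] mg/kg."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if not low <= dose_mg_per_kg <= high:
        raise ValueError("dosage outside the stated 0.01 to 30 mg/kg range")
    return weight_kg * dose_mg_per_kg


def per_administration_mg(total_daily_mg: float, doses_per_day: int) -> float:
    """Split a total daily dose evenly across several administrations."""
    if doses_per_day < 1:
        raise ValueError("at least one administration per day is required")
    return total_daily_mg / doses_per_day


# Hypothetical 70 kg patient at 2 mg/kg, given in two doses per day.
total = daily_dose_mg(weight_kg=70.0, dose_mg_per_kg=2.0)
print(total)                            # 140.0
print(per_administration_mg(total, 2))  # 70.0
```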

[0413] Guidance as to particular dosages and methods of delivery is provided in the literature and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, locations, etc.

[0414] Conventional methods, known to those of ordinary skill in the art of medicine, can be used to administer the pharmaceutical formulation(s) of the present invention to the patient. The pharmaceutical compositions of the present invention can be administered alone, or in combination with other therapeutic agents or interventions.

[0415] Therapeutic Methods

[0416] The present invention further provides methods of treating subjects having defects in a gene of the invention, e.g., in expression, activity, distribution, localization, and/or solubility, which can manifest as a disorder of breast function. As used herein, “treating” includes all medically-acceptable types of therapeutic intervention, including palliation and prophylaxis (prevention) of disease. The term “treating” encompasses any improvement of a disease, including minor improvements. These methods are discussed below.

[0417] Gene Therapy and Vaccines

[0418] The isolated nucleic acids of the present invention can also be used to drive in vivo expression of the polypeptides of the present invention. In vivo expression can be driven from a vector, typically a viral vector, often a vector based upon a replication incompetent retrovirus, an adenovirus, or an adeno-associated virus (AAV), for purpose of gene therapy. In vivo expression can also be driven from signals endogenous to the nucleic acid or from a vector, often a plasmid vector, such as pVAX1 (Invitrogen, Carlsbad, Calif., USA), for purpose of “naked” nucleic acid vaccination, as further described in U.S. Pat. Nos. 5,589,466; 5,679,647; 5,804,566; 5,830,877; 5,843,913; 5,880,104; 5,958,891; 5,985,847; 6,017,897; 6,110,898; and 6,204,250, the disclosures of which are incorporated herein by reference in their entireties. For cancer therapy, it is preferred that the vector also be tumor-selective. See, e.g., Doronin et al., J. Virol. 75: 3314-24 (2001).

[0419] In another embodiment of the therapeutic methods of the present invention, a therapeutically effective amount of a pharmaceutical composition comprising a nucleic acid of the present invention is administered. The nucleic acid can be delivered in a vector that drives expression of a BSP, fusion protein, or fragment thereof, or without such vector. Nucleic acid compositions that can drive expression of a BSP are administered, for example, to complement a deficiency in the native BSP, or as DNA vaccines. Expression vectors derived from virus, replication deficient retroviruses, adenovirus, adeno-associated (AAV) virus, herpes virus, or vaccinia virus can be used as can plasmids. See, e.g., Cid-Arregui, supra. In a preferred embodiment, the nucleic acid molecule encodes a BSP having the amino acid sequence of SEQ ID NO: 116 through 218, or a fragment, fusion protein, allelic variant or homolog thereof.

[0420] In still other therapeutic methods of the present invention, pharmaceutical compositions comprising host cells that express a BSP, fusions, or fragments thereof can be administered. In such cases, the cells are typically autologous, so as to circumvent xenogeneic or allotypic rejection, and are administered to complement defects in BSP production or activity. In a preferred embodiment, the nucleic acid molecules in the cells encode a BSP having the amino acid sequence of SEQ ID NO: 116 through 218, or a fragment, fusion protein, allelic variant or homolog thereof.

[0421] Antisense Administration

[0422] Antisense nucleic acid compositions, or vectors that drive expression of a BSG antisense nucleic acid, are administered to downregulate transcription and/or translation of a BSG in circumstances in which excessive production, or production of aberrant protein, is the pathophysiologic basis of disease.

[0423] Antisense compositions useful in therapy can have a sequence that is complementary to coding or to noncoding regions of a BSG. For example, oligonucleotides derived from the transcription initiation site, e.g., between positions −10 and +10 from the start site, are preferred.

[0424] Catalytic antisense compositions, such as ribozymes, that are capable of sequence-specific hybridization to BSG transcripts, are also useful in therapy. See, e.g., Phylactou, Adv. Drug Deliv. Rev. 44(2-3): 97-108 (2000); Phylactou et al., Hum. Mol. Genet. 7(10): 1649-53 (1998); Rossi, Ciba Found. Symp. 209: 195-204 (1997); and Sigurdsson et al., Trends Biotechnol. 13(8): 286-9 (1995), the disclosures of which are incorporated herein by reference in their entireties.

[0425] Other nucleic acids useful in the therapeutic methods of the present invention are those that are capable of triplex helix formation in or near the BSG genomic locus. Such triplexing oligonucleotides are able to inhibit transcription. See, e.g., Intody et al., Nucleic Acids Res. 28(21): 4283-90 (2000); McGuffie et al., Cancer Res. 60(14): 3790-9 (2000), the disclosures of which are incorporated herein by reference. Pharmaceutical compositions comprising such triplex forming oligos (TFOs) are administered in circumstances in which excessive production, or production of aberrant protein, is a pathophysiologic basis of disease.

[0426] In a preferred embodiment, the antisense molecule is derived from a nucleic acid molecule encoding a BSP, preferably a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218, or a fragment, allelic variant or homolog thereof. In a more preferred embodiment, the antisense molecule is derived from a nucleic acid molecule having a nucleotide sequence of SEQ ID NO: 1 through 115, or a part, allelic variant, substantially similar or hybridizing nucleic acid thereof.

[0427] Polypeptide Administration

[0428] In one embodiment of the therapeutic methods of the present invention, a therapeutically effective amount of a pharmaceutical composition comprising a BSP, a fusion protein, fragment, analog or derivative thereof is administered to a subject with a clinically-significant BSP defect.

[0429] Protein compositions are administered, for example, to complement a deficiency in native BSP. In other embodiments, protein compositions are administered as a vaccine to elicit a humoral and/or cellular immune response to BSP. The immune response can be used to modulate activity of BSP or, depending on the immunogen, to immunize against aberrant or aberrantly expressed forms, such as mutant or inappropriately expressed isoforms. In yet other embodiments, protein fusions having a toxic moiety are administered to ablate cells that aberrantly accumulate BSP.

[0430] In a preferred embodiment, the polypeptide is a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218, or a fusion protein, allelic variant, homolog, analog or derivative thereof. In a more preferred embodiment, the polypeptide is encoded by a nucleic acid molecule having a nucleotide sequence of SEQ ID NO: 1 through 115, or a part, allelic variant, substantially similar or hybridizing nucleic acid thereof.

[0431] Antibody, Agonist and Antagonist Administration

[0432] In another embodiment of the therapeutic methods of the present invention, a therapeutically effective amount of a pharmaceutical composition comprising an antibody (including fragment or derivative thereof) of the present invention is administered. As is well-known, antibody compositions are administered, for example, to antagonize activity of BSP, or to target therapeutic agents to sites of BSP presence and/or accumulation. In a preferred embodiment, the antibody specifically binds to a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218, or a fusion protein, allelic variant, homolog, analog or derivative thereof. In a more preferred embodiment, the antibody specifically binds to a BSP encoded by a nucleic acid molecule having a nucleotide sequence of SEQ ID NO: 1 through 115, or a part, allelic variant, substantially similar or hybridizing nucleic acid thereof.

[0433] The present invention also provides methods for identifying modulators which bind to a BSP or have a modulatory effect on the expression or activity of a BSP. Modulators which decrease the expression or activity of BSP (antagonists) are believed to be useful in treating breast cancer. Such screening assays are known to those of skill in the art and include, without limitation, cell-based assays and cell-free assays. Small molecules predicted via computer imaging to specifically bind to regions of a BSP can also be designed, synthesized and tested for use in the imaging and treatment of breast cancer. Further, libraries of molecules can be screened for potential anticancer agents by assessing the ability of the molecule to bind to the BSPs identified herein. Molecules identified in the library as being capable of binding to a BSP are key candidates for further evaluation for use in the treatment of breast cancer. In a preferred embodiment, these molecules will downregulate expression and/or activity of a BSP in cells.

[0434] In another embodiment of the therapeutic methods of the present invention, a pharmaceutical composition comprising a non-antibody antagonist of BSP is administered. Antagonists of BSP can be produced using methods generally known in the art. In particular, purified BSP can be used to screen libraries of pharmaceutical agents, often combinatorial libraries of small molecules, to identify those that specifically bind and antagonize at least one activity of a BSP.

[0435] In other embodiments a pharmaceutical composition comprising an agonist of a BSP is administered. Agonists can be identified using methods analogous to those used to identify antagonists.

[0436] In a preferred embodiment, the antagonist or agonist specifically binds to and antagonizes or agonizes, respectively, a BSP comprising an amino acid sequence of SEQ ID NO: 116 through 218, or a fusion protein, allelic variant, homolog, analog or derivative thereof. In a more preferred embodiment, the antagonist or agonist specifically binds to and antagonizes or agonizes, respectively, a BSP encoded by a nucleic acid molecule having a nucleotide sequence of SEQ ID NO: 1 through 115, or a part, allelic variant, substantially similar or hybridizing nucleic acid thereof.

[0437] Targeting Breast Tissue

[0438] The invention also provides a method in which a polypeptide of the invention, or an antibody thereto, is linked to a therapeutic agent such that it can be delivered to the breast or to specific cells in the breast. In a preferred embodiment, an anti-BSP antibody is linked to a therapeutic agent and is administered to a patient in need of such therapeutic agent. The therapeutic agent may be a toxin, if breast tissue needs to be selectively destroyed. This would be useful for targeting and killing breast cancer cells. In another embodiment, the therapeutic agent may be a growth or differentiation factor, which would be useful for promoting breast cell function.

[0439] In another embodiment, an anti-BSP antibody may be linked to an imaging agent that can be detected using, e.g., magnetic resonance imaging, CT or PET. This would be useful for determining and monitoring breast function, identifying breast cancer tumors, and identifying noncancerous breast diseases.

EXAMPLES

Example 1: Gene Expression Analysis

[0440] BSGs were identified by a systematic analysis of gene expression data in the LIFESEQ® Gold database available from Incyte Genomics Inc. (Palo Alto, Calif.) using the data mining software package CLASP™ (Candidate Lead Automatic Search Program). CLASP™ is a set of algorithms that interrogate Incyte's database to identify genes that are both specific to particular tissue types and differentially expressed in tissues from patients with cancer. LifeSeq® Gold contains information about which genes are expressed in various tissues in the body and about the dynamics of expression in both normal and diseased states. CLASP™ first sorts the LifeSeq® Gold database into defined tissue types, such as breast, ovary and prostate. CLASP™ categorizes each tissue sample by disease state. Disease states include “healthy,” “cancer,” “associated with cancer,” “other disease” and “other.” Categorizing the disease states improves our ability to identify tissue and cancer-specific molecular targets. CLASP™ then performs a simultaneous parallel search for genes that are expressed both (1) selectively in the defined tissue type compared to other tissue types and (2) differentially in the “cancer” disease state compared to the other disease states affecting the same, or different, tissues. This sorting is accomplished by using mathematical and statistical filters that specify the minimum change in expression levels and the minimum frequency that the differential expression pattern must be observed across the tissue samples for the gene to be considered statistically significant. The CLASP™ algorithm quantifies the relative abundance of a particular gene in each tissue type and in each disease state.

[0441] To find the BSGs of this invention, the following specific CLASP™ profiles were utilized: tissue-specific expression (CLASP 1), detectable expression only in cancer tissue (CLASP 2), highest differential expression for a given cancer (CLASP 4), and differential expression in cancer tissue (CLASP 5). cDNA libraries were divided into 60 unique tissue types (early versions of LifeSeq® had 48 tissue types). Genes or ESTs were grouped into “gene bins,” where each bin is a cluster of sequences grouped together because they share a common contig. The expression level for each gene bin was calculated for each tissue type. Differential expression significance was calculated with rigorous statistical significance testing, taking into account variations in sample size and relative gene abundance in different libraries and within each library (for the equations used to determine statistically significant expression see Audic and Claverie, “The significance of digital gene expression profiles,” Genome Res 7(10): 986-995 (1997), including Equation 1 on page 987 and Equation 2 on page 988, the contents of which are incorporated by reference). Differentially expressed tissue-specific genes were selected based on the percentage abundance level in the targeted tissue versus all the other tissues (tissue-specificity). The expression levels for each gene in libraries of normal tissues or non-tumor tissues from cancer patients were compared with the expression levels in tissue libraries associated with tumor or disease (cancer-specificity). The results were analyzed for statistical significance.
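For readers who want to see how a digital expression comparison of the kind cited above can be computed, the following minimal sketch implements the Audic and Claverie probability of observing y clone counts in a second library of size N2 given x counts in a first library of size N1, p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. The function names and the example counts are hypothetical, and the way CLASP™ itself applies this statistic (thresholds, corrections, library pooling) is not described here and is not assumed.

```python
import math


def ac_prob(x: int, y: int, n1: int, n2: int) -> float:
    """Audic-Claverie p(y | x) for library sizes n1 and n2, computed in log space."""
    r = n2 / n1
    log_p = (y * math.log(r)
             + math.lgamma(x + y + 1)   # log (x+y)!
             - math.lgamma(x + 1)       # log x!
             - math.lgamma(y + 1)       # log y!
             - (x + y + 1) * math.log(1.0 + r))
    return math.exp(log_p)


def ac_upper_tail(x: int, y: int, n1: int, n2: int) -> float:
    """P(Y >= y | x): chance of seeing at least y counts in library 2 under no change."""
    return 1.0 - sum(ac_prob(x, k, n1, n2) for k in range(y))


# Hypothetical counts: 1 clone in a normal library, 10 in a tumor library,
# with both libraries containing 1000 sequenced clones.
p = ac_upper_tail(x=1, y=10, n1=1000, n2=1000)
print(p < 0.10)  # True: the increase would pass a 90% confidence threshold
```

Working with `math.lgamma` rather than explicit factorials keeps the calculation stable for the large clone counts typical of abundant transcripts.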

[0442] The selection of the target genes meeting the rigorous CLASP™profile criteria were as follows:

[0443] (a) CLASP 1: tissue-specific expression: To qualify as a CLASP 1candidate, a gene must exhibit statistically significant expression inthe tissue of interest compared to all other tissues. Only if the geneexhibits such differential expression with a 90% of confidence level isit selected as a CLASP 1 candidate.

[0444] (b) CLASP 2: detectable expression only in cancer tissue: Toqualify as a CLASP 2 candidate, a gene must exhibit detectableexpression in tumor tissues and undetectable expression in librariesfrom normal individuals and libraries from normal tissue obtained fromdiseased patients. In addition, such a gene must also exhibit furtherspecificity for the tumor tissues of interest.

[0445] (c) CLASP 5: differential expression in cancer tissue: To qualifyas a CLASP 5 candidate, a gene must be differentially expressed in tumorlibraries in the tissue of interest compared to normal libraries for alltissues. Only if the gene exhibits such differential expression with a90% of confidence level is it selected as a CLASP 5 candidate.
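By way of illustration only, the significance test cited above (Audic and Claverie, Equation 2) may be sketched as follows. The sketch computes the probability of observing y clones of a gene in a library of n2 clones, given x clones in a library of n1 clones. The function names, the one-sided upper-tail test, and the way the 90% confidence level is applied as a cutoff are assumptions made for this example; they are not taken from the CLASP™ implementation, and the counts used are hypothetical.

```python
from math import exp, lgamma, log


def audic_claverie_prob(x, y, n1, n2):
    """P(y | x): probability of y clones in a library of n2 clones,
    given x clones of the same gene in a library of n1 clones."""
    ratio = n2 / n1
    log_p = (
        y * log(ratio)
        + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
        - (x + y + 1) * log(1.0 + ratio)
    )
    return exp(log_p)


def upper_tail_pvalue(x, y, n1, n2):
    """Probability of seeing y or more clones in the second library
    (assumed one-sided test for over-representation)."""
    lower = sum(audic_claverie_prob(x, k, n1, n2) for k in range(y))
    return max(0.0, 1.0 - lower)


def passes_clasp_threshold(x, y, n1, n2, confidence=0.90):
    """Assumed reading of the 90% confidence criterion: the tail
    probability must not exceed 1 - confidence."""
    return upper_tail_pvalue(x, y, n1, n2) <= 1.0 - confidence


# Hypothetical counts: 0 clones in the reference library and 5 clones
# in the tissue of interest, both libraries containing 1000 clones.
print(upper_tail_pvalue(0, 5, 1000, 1000))
print(passes_clasp_threshold(0, 5, 1000, 1000))
```

The logarithm of the gamma function is used instead of factorials so that large clone counts do not overflow.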

[0446] The CLASP™ scores for some of sequences found be the mRNAsubtractions are listed below:

[0447] DEX0267_23 Breast 5

[0448] DEX0267_71 Breast 5

[0449] DEX0267_78 Breast 5 and 1

[0450] DEX0267_89 Breast 5

[0451] DEX0267_101 Breast 5

[0452] The CLASP™ expression levels for selected sequences are listed below:

[0453] DEX0267_11 SEQ ID NO: 11 BRN .0002 LNG .0011 FAL .0063 ESO .0102

[0454] DEX0267_23 SEQ ID NO: 23 MAM .0179 TST .0011 BLO .0019 SPL .002 GEM .0021

[0455] DEX0267_61 SEQ ID NO: 61 MAM 1.0726 NOS .3813 PLE .4337 PIB .5075 TST .5487

[0456] DEX0267_66 SEQ ID NO: 66 BRN .0002 LNG .0011 PRO .0011

[0457] DEX0267_67 SEQ ID NO: 67 LMN .0028 URE .0112 UNC .016

[0458] DEX0267_71 SEQ ID NO: 71 MAM .0142 UTR .0094 ADR .0179

[0459] DEX0267_73 SEQ ID NO: 73 MAM .0028 UTR .0006 THY .002 OVR .0031 ESO .0051

[0460] DEX0267_76 SEQ ID NO: 76 INS .0076

[0461] DEX0267_78 SEQ ID NO: 78 MAM .0014 FTS .0001 UTR .0004 PRO .0007 CTL .0046

[0462] DEX0267_80 SEQ ID NO: 80 UTR .0006 BLD .0048 FAL .0063 CRD .0068

[0463] DEX0267_89 SEQ ID NO: 89 MAM .0094 SPL .0063 OVR .0092 PNS .0094 PLE .0299

[0464] DEX0267_93 SEQ ID NO: 93 TST .0054

[0465] DEX0267_94 SEQ ID NO: 94 TST .0054

[0466] DEX0267_98 SEQ ID NO: 98 MAM .3287 SAG .079 UNC .1635 PIT .2054 INT .2103

[0467] DEX0267_100 SEQ ID NO: 100 PNS .0164 LMN .0222 OVR .0246 NOS .0587

[0468] DEX0267_101 SEQ ID NO: 101 MAM .0061 STO .0081 FAL .0126 URE .0337

[0469] DEX0267_115 SEQ ID NO: 115 MAM .0128 ADR .0015 LIV .0019 SPL .0021 CRD .0023

[0470] Abbreviations for tissues:

[0471] ADR Adrenal Glands; BLD Bladder; BLO Blood; BRN Brain; CRD Heart; CTL Cartilage; ESO Esophagus; FAL Fallopian Tubes; FTS Fetus; GEM Germ Cells; INS Intestine, Small; INT Intestine; LIV Liver; LMN Lymphoid Tissue; LNG Lung; MAM Breast; NOS Nose; OVR Ovary; PIB Pineal Body; PIT Pituitary Gland; PLE Pleura; PNS Penis; PRO Prostate; SAG Salivary Glands; SPL Spleen; STO Stomach; THY Thymus Gland; TST Testis; UNC Mixed Tissues; URE Ureter; UTR Uterus

[0472] The chromosomal locations for the sequences are as follows:

[0473] DEX0267_2 chromosome 2

[0474] DEX0267_12 chromosome 9

[0475] DEX0267_23 chromosome 4

[0476] DEX0267_31 chromosome 10

[0477] DEX0267_36 chromosome 16

[0478] DEX0267_44 chromosome 10

[0479] DEX0267_72 chromosome 15

[0480] DEX0267_73 chromosome 1

[0481] DEX0267_94 chromosome 2

[0482] DEX0267_96 chromosome 14

[0483] DEX0267_103 chromosome 16

Example 2 Relative Quantitation of Gene Expression

[0484] Real-Time quantitative PCR with fluorescent Taqman probes is a quantitative detection system utilizing the 5′-3′ nuclease activity of Taq DNA polymerase. The method uses an internal fluorescent oligonucleotide probe (Taqman) labeled with a 5′ reporter dye and a downstream, 3′ quencher dye. During PCR, the 5′-3′ nuclease activity of Taq DNA polymerase releases the reporter, whose fluorescence can then be detected by the laser detector of the Model 7700 Sequence Detection System (PE Applied Biosystems, Foster City, Calif., USA). Amplification of an endogenous control is used to standardize the amount of sample RNA added to the reaction and normalize for Reverse Transcriptase (RT) efficiency. Either cyclophilin, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), ATPase, or 18S ribosomal RNA (rRNA) is used as this endogenous control. To calculate relative quantitation between all the samples studied, the target RNA levels for one sample were used as the basis for comparative results (calibrator). Quantitation relative to the “calibrator” can be obtained using the standard curve method or the comparative method (User Bulletin #2: ABI PRISM 7700 Sequence Detection System).
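As a non-limiting illustration, the comparative method referred to above is commonly computed as 2 raised to the power of minus ΔΔCt, where ΔCt is the threshold cycle of the target minus that of the endogenous control, and ΔΔCt is the ΔCt of the sample minus the ΔCt of the calibrator. The sketch below assumes that the target and control amplify with roughly equal, near 100% efficiency; the function name and the threshold cycle values are hypothetical and are not data from the present examples.

```python
def relative_expression(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """Relative quantity of the target in a sample versus the calibrator,
    using the comparative threshold-cycle (2 ** -ddCt) calculation."""
    # Normalize the target to the endogenous control in each sample.
    delta_ct_sample = ct_target - ct_control
    delta_ct_calibrator = ct_target_cal - ct_control_cal
    # Compare the normalized sample to the normalized calibrator.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical threshold cycles: the target crosses threshold three
# cycles earlier in the cancer sample than in the calibrator tissue,
# while the endogenous control is unchanged.
print(relative_expression(22.0, 18.0, 25.0, 18.0))  # 8.0
```

By construction the calibrator compared with itself yields a relative level of 1, so every reported value reads as a fold difference from the calibrator tissue.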

[0485] The tissue distribution and the level of the target gene are evaluated for every sample in normal and cancer tissues. Total RNA is extracted from normal tissues, cancer tissues, and from cancers and the corresponding matched adjacent tissues. Subsequently, first strand cDNA is prepared with reverse transcriptase and the polymerase chain reaction is done using primers and Taqman probes specific to each target gene. The results are analyzed using the ABI PRISM 7700 Sequence Detector. The absolute numbers are relative levels of expression of the target gene in a particular tissue compared to the calibrator tissue.

[0486] One of ordinary skill can design appropriate primers. The relative levels of expression of the BSNA versus normal tissues and other cancer tissues can then be determined. All the values are compared to a normal tissue (calibrator). These RNA samples are commercially available pools, originated by pooling samples of a particular tissue from different individuals.

[0487] The relative levels of expression of the BSNA in pairs of matching samples (one cancer sample and one normal or normal adjacent tissue sample) may also be determined. All the values are compared to a normal tissue (calibrator). A matching pair is formed by mRNA from the cancer sample for a particular tissue and mRNA from the normal adjacent sample for that same tissue from the same individual.

[0488] In the analysis of matching samples, BSNAs show a high degree of tissue specificity for the tissue of interest. Results from these experiments confirm the tissue specificity results obtained with normal pooled samples.

[0489] Further, the level of mRNA expression in cancer samples and the isogenic normal adjacent tissue from the same individual are compared. This comparison provides an indication of specificity for the cancer stage (e.g., higher levels of mRNA expression in the cancer sample compared to the normal adjacent).

[0490] Altogether, the high level of tissue specificity, plus the mRNA overexpression in the matching samples tested, is indicative of SEQ ID NO: 1 through 115 being diagnostic markers for cancer.

Example 3 Protein Expression

[0491] The BSNA is amplified by polymerase chain reaction (PCR) and the amplified DNA fragment encoding the BSNA is subcloned in pET-21d for expression in E. coli. In addition to the BSNA coding sequence, codons for two amino acids, Met-Ala, flanking the NH₂-terminus of the coding sequence of BSNA, and six histidines, flanking the COOH-terminus of the coding sequence of BSNA, are incorporated to serve as initiating Met/restriction site and purification tag, respectively.

[0492] An over-expressed protein band of the appropriate molecular weight may be observed on a Coomassie blue stained polyacrylamide gel. This protein band is confirmed by Western blot analysis using a monoclonal antibody against the 6×Histidine tag.

[0493] Large-scale purification of BSP was achieved using cell paste generated from 6-liter bacterial cultures, and purified using immobilized metal affinity chromatography (IMAC). Soluble fractions that had been separated from total cell lysate were incubated with a nickel chelating resin. The column was packed and washed with five column volumes of wash buffer. BSP was eluted stepwise with imidazole buffers of various concentrations.

Example 4 Protein Fusions

[0494] Briefly, the human Fc portion of the IgG molecule can be PCR amplified, using primers that span the 5′ and 3′ ends of the sequence described below. These primers also should have convenient restriction enzyme sites that will facilitate cloning into an expression vector, preferably a mammalian expression vector. For example, if pC4 (Accession No. 209646) is used, the human Fc portion can be ligated into the BamHI cloning site. Note that the 3′ BamHI site should be destroyed. Next, the vector containing the human Fc portion is re-restricted with BamHI, linearizing the vector, and a polynucleotide of the present invention, isolated by the PCR protocol described in Example 2, is ligated into this BamHI site. Note that the polynucleotide is cloned without a stop codon; otherwise a fusion protein will not be produced. If the naturally occurring signal sequence is used to produce the secreted protein, pC4 does not need a second signal peptide. Alternatively, if the naturally occurring signal sequence is not used, the vector can be modified to include a heterologous signal sequence. See, e.g., WO 96/34891.

Example 5 Production of an Antibody from a Polypeptide

[0495] In general, such procedures involve immunizing an animal (preferably a mouse) with polypeptide or, more preferably, with a secreted polypeptide-expressing cell. Such cells may be cultured in any suitable tissue culture medium; however, it is preferable to culture cells in Earle's modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at about 56° C.), and supplemented with about 10 g/l of nonessential amino acids, about 1,000 U/ml of penicillin, and about 100 μg/ml of streptomycin. The splenocytes of such mice are extracted and fused with a suitable myeloma cell line. Any suitable myeloma cell line may be employed in accordance with the present invention; however, it is preferable to employ the parent myeloma cell line (SP20), available from the ATCC. After fusion, the resulting hybridoma cells are selectively maintained in HAT medium, and then cloned by limiting dilution as described by Wands et al., Gastroenterology 80: 225-232 (1981).

[0496] The hybridoma cells obtained through such a selection are then assayed to identify clones which secrete antibodies capable of binding the polypeptide. Alternatively, additional antibodies capable of binding to the polypeptide can be produced in a two-step procedure using anti-idiotypic antibodies. Such a method makes use of the fact that antibodies are themselves antigens, and therefore, it is possible to obtain an antibody which binds to a second antibody. In accordance with this method, protein-specific antibodies are used to immunize an animal, preferably a mouse. The splenocytes of such an animal are then used to produce hybridoma cells, and the hybridoma cells are screened to identify clones which produce an antibody whose ability to bind to the protein-specific antibody can be blocked by the polypeptide. Such antibodies comprise anti-idiotypic antibodies to the protein-specific antibody and can be used to immunize an animal to induce formation of further protein-specific antibodies.

[0497] Using the Jameson-Wolf method, the following epitopes were predicted (Jameson and Wolf, CABIOS, 4(1), 181-186, 1988, the contents of which are incorporated by reference).

[0498] DEX0267_116 Antigenicity Index (Jameson-Wolf)

[0499] positions AI avg length

[0500] 18-28 1.01 11

[0501] DEX0267_118 Antigenicity Index (Jameson-Wolf)

[0502] positions AI avg length

[0503] 12-29 1.01 18

[0504] DEX0267_120 Antigenicity Index (Jameson-Wolf)

[0505] positions AI avg length

[0506] 150-162 1.30 13

[0507] 55-65 1.09 11

[0508] 3-51 1.03 49

[0509] 101-123 1.03 23

[0510] DEX0267_122 Antigenicity Index (Jameson-Wolf)

[0511] positions AI avg length

[0512] 23-32 1.05 10

[0513] DEX0267_125 Antigenicity Index (Jameson-Wolf)

[0514] positions AI avg length

[0515] 221-233 1.16 13

[0516] 124-142 1.16 19

[0517] 279-289 1.14 11

[0518] 261-271 1.10 11

[0519] DEX0267_129 Antigenicity Index (Jameson-Wolf)

[0520] positions AI avg length

[0521] 7-48 1.13 42

[0522] DEX0267_133 Antigenicity Index (Jameson-Wolf)

[0523] positions AI avg length

[0524] 398-409 1.30 12

[0525] 22-38 1.21 17

[0526] 478-489 1.15 12

[0527] 90-103 1.10 14

[0528] 111-134 1.06 24

[0529] 376-396 1.05 21

[0530] 319-328 1.04 10

[0531] 331-366 1.02 36

[0532] DEX0267_138 Antigenicity Index (Jameson-Wolf)

[0533] positions AI avg length

[0534] 67-77 1.01 11

[0535] DEX0267_140 Antigenicity Index (Jameson-Wolf)

[0536] positions AI avg length

[0537] 30-42 1.17 13

[0538] DEX0267_141 Antigenicity Index (Jameson-Wolf)

[0539] positions AI avg length

[0540] 100-115 1.10 16

[0541] DEX0267_143 Antigenicity Index (Jameson-Wolf)

[0542] positions AI avg length

[0543] 108-118 1.10 11

[0544] 166-216 1.02 51

[0545] DEX0267_144 Antigenicity Index (Jameson-Wolf)

[0546] positions AI avg length

[0547] 17-26 1.06 10

[0548] DEX0267_146 Antigenicity Index (Jameson-Wolf)

[0549] positions AI avg length

[0550] 8-58 1.06 51

[0551] DEX0267_148 Antigenicity Index (Jameson-Wolf)

[0552] positions AI avg length

[0553] 41-56 1.15 16

[0554] DEX0267_153 Antigenicity Index (Jameson-Wolf)

[0555] positions AI avg length

[0556] 39-73 1.13 35

[0557] DEX0267_155 Antigenicity Index (Jameson-Wolf)

[0558] positions AI avg length

[0559] 7-32 1.11 26

[0560] 56-71 1.00 16

[0561] DEX0267_156 Antigenicity Index (Jameson-Wolf)

[0562] positions AI avg length

[0563] 7-19 1.06 13

[0564] DEX0267_158 Antigenicity Index (Jameson-Wolf)

[0565] positions AI avg length

[0566] 98-118 1.00 21

[0567] DEX0267_167 Antigenicity Index (Jameson-Wolf)

[0568] positions AI avg length

17-28 1.14 12

[0569] DEX0267_170 Antigenicity Index (Jameson-Wolf)

[0570] positions AI avg length

[0571] 55-68 1.36 14

[0572] 18-43 1.12 26

[0573] DEX0267_171 Antigenicity Index (Jameson-Wolf)

[0574] positions AI avg length

[0575] 88-107 1.16 20

[0576] DEX0267_175 Antigenicity Index (Jameson-Wolf)

[0577] positions AI avg length

[0578] 108-119 1.10 12

[0579] DEX0267_179 Antigenicity Index (Jameson-Wolf)

[0580] positions AI avg length

[0581] 358-388 1.20 31

[0582] 311-342 1.11 32

[0583] 218-230 1.05 13

[0584] 18-37 1.00 20

[0585] DEX0267_182 Antigenicity Index (Jameson-Wolf)

[0586] positions AI avg length

[0587] 162-176 1.11 15

[0588] DEX0267_191 Antigenicity Index (Jameson-Wolf)

[0589] positions AI avg length

[0590] 5-33 1.12 29

[0591] DEX0267_192 Antigenicity Index (Jameson-Wolf)

[0592] positions AI avg length

[0593] 187-207 1.11 21

[0594] 44-56 1.09 13

[0595] DEX0267_194 Antigenicity Index (Jameson-Wolf)

[0596] positions AI avg length

[0597] 46-61 1.15 16

[0598] 74-96 1.13 23

[0599] DEX0267_196 Antigenicity Index (Jameson-Wolf)

[0600] positions AI avg length

[0601] 8-29 1.16 22

[0602] DEX0267_197 Antigenicity Index (Jameson-Wolf)

[0603] positions AI avg length

[0604] 26-35 1.06 10

[0605] 90-101 1.05 12

[0606] DEX0267_199 Antigenicity Index (Jameson-Wolf)

[0607] positions AI avg length

[0608] 5-25 1.14 21

[0609] 27-42 1.10 16

[0610] DEX0267_201 Antigenicity Index (Jameson-Wolf)

[0611] positions AI avg length

[0612] 123-138 1.15 16

[0613] DEX0267_202 Antigenicity Index (Jameson-Wolf)

[0614] positions AI avg length

[0615] 15-32 1.25 18

[0616] DEX0267_205 Antigenicity Index (Jameson-Wolf)

[0617] positions AI avg length

[0618] 14-23 1.03 10

[0619] DEX0267_206 Antigenicity Index (Jameson-Wolf)

[0620] positions AI avg length

[0621] 8-23 1.19 16

[0622] DEX0267_208 Antigenicity Index (Jameson-Wolf)

[0623] positions AI avg length

[0624] 30-39 1.23 10

[0625] 11-27 1.07 17

[0626] DEX0267_210 Antigenicity Index (Jameson-Wolf)

[0627] positions AI avg length

[0628] 56-67 1.17 12

[0629] DEX0267_211 Antigenicity Index (Jameson-Wolf)

[0630] positions AI avg length

[0631] 35-44 1.05 10

[0632] DEX0267_212 Antigenicity Index (Jameson-Wolf)

[0633] positions AI avg length

[0634] 80-89 1.12 10

[0635] 43-68 1.07 26

[0636] 95-108 1.04 14

[0637] DEX0267_213 Antigenicity Index (Jameson-Wolf)

[0638] positions AI avg length

[0639] 114-123 1.33 10

[0640] DEX0267_214 Antigenicity Index (Jameson-Wolf)

[0641] positions AI avg length

[0642] 22-36 0.15 15

[0643] DEX0267_215 Antigenicity Index (Jameson-Wolf)

[0644] positions AI avg length

[0645] 17-27 1.00 11

[0646] DEX0267_218 Antigenicity Index (Jameson-Wolf)

[0647] positions AI avg length

[0648] 26-46 1.10 21

[0649] Examples of post-translational modifications (PTMs) of the BSPs of this invention are listed below. In addition, antibodies that specifically bind such post-translational modifications may be useful as a diagnostic or as a therapeutic. Using the ProSite database (Bairoch et al., Nucleic Acids Res. 25(1):217-221 (1997), the contents of which are incorporated by reference), the following PTMs were predicted for the BSPs of the invention (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_prosite.html most recently accessed Oct. 23, 2001). For full definitions of the PTMs see http://www.expasy.org/cgi-bin/prosite-list.pl most recently accessed Oct. 23, 2001.

[0650] DEX0267_117 Camp_Phospho_Site 10-13;

[0651] DEX0267_118 Ck2_Phospho_Site 45-48; Myristyl 27-32;32-37; Pkc_Phospho_Site 13-15;99-101;

[0652] DEX0267_119 Ck2_Phospho_Site 32-35; Myristyl 49-54;

[0653] DEX0267_120 Amidation 86-89; Asn_Glycosylation 90-93; Camp_Phospho_Site 105-108; Ck2_Phospho_Site 125-128;174-177; Myristyl 71-76;159-164;184-189; Pkc_Phospho_Site 103-105;

[0654] DEX0267_121 Asn_Glycosylation 27-30; Ck2_Phospho_Site 29-32; Pkc_Phospho_Site 14-16;

[0655] DEX0267_122 Camp_Phospho_Site 73-76; Ck2_Phospho_Site 23-26;102-105; Myristyl 4-9;55-60;84-89; Pkc_Phospho_Site 23-25;69-71;88-90;113-115;

[0656] DEX0267_124 Asn_Glycosylation 36-39;

[0657] DEX0267_125 Asn_Glycosylation 56-59;268-271;283-286; Camp_Phospho_Site 191-194;221-224; Ck2_Phospho_Site 106-109;136-139;147-150;255-258; Glycosaminoglycan 231-234; Pkc_Phospho_Site 66-68;69-71;147-149;

[0658] DEX0267_126 Myristyl 16-21;55-60;

[0659] DEX0267_127 Asn_Glycosylation 23-26;

[0660] DEX0267_128 Cytochrome_C 36-41; Myristyl 2-7;4-9;63-68;

[0661] DEX0267_129 Camp_Phospho_Site 9-12; Ck2_Phospho_Site 60-63;76-79; Myristyl 28-33; Pkc_Phospho_Site 7-9;12-14;46-48;

[0662] DEX0267_130 Myristyl 16-21;38-43;

[0663] DEX0267_131 Amidation 19-22; Ck2_Phospho_Site 76-79; Myristyl 36-41;37-42; Pkc_Phospho_Site 13-15;76-78;

[0664] DEX0267_132 Myristyl 15-20;

[0665] DEX0267_133 Asn_Glycosylation 98-101;289-292;322-325; Ck2_Phospho_Site 2-5;80-83;199-202;217-220; Myristyl 8-13;41-46;97-102;187-192;251-256;252-257;287-292;484-489; Pkc_Phospho_Site 28-30;29-31;34-36;110-112;113-115;124-126;199-201;239-241;296-298;327-329;

[0666] DEX0267_134 Myristyl 53-58;

[0667] DEX0267_135 Myristyl 61-66;

[0668] DEX0267_136 Asn_Glycosylation 65-68; Camp_Phospho_Site 20-23;26-29; Myristyl 46-51; Pkc_Phospho_Site 23-25;

[0669] DEX0267_137 Asn_Glycosylation 82-85;85-88; Ck2_Phospho_Site 15-18;33-36;48-51; Myristyl 27-32; Pkc_Phospho_Site 15-17;23-25;57-59;81-83;

[0670] DEX0267_138 Myristyl 38-43;

[0671] DEX0267_139 Ck2_Phospho_Site 7-10;

[0672] DEX0267_140 Myristyl 13-18;27-32;

[0673] DEX0267_141 Camp_Phospho_Site 78-81; Pkc_Phospho_Site 99-101;105-107;

[0674] DEX0267_142 Myristyl 24-29; Pkc_Phospho_Site 17-19;49-51;

[0675] DEX0267_143 Amidation 64-67;149-152; Camp_Phospho_Site 99-102;181-184; Myristyl 42-47;45-50;212-217;213-218; Pkc_Phospho_Site 14-16;97-99;112-114;131-133;132-134;159-161;

[0676] DEX0267_144 Ck2_Phospho_Site 3-6; Pkc_Phospho_Site 3-5;9-11;

[0677] DEX0267_145 Amidation 178-181; Ck2_Phospho_Site 274-277; Myristyl 39-44;102-107;174-179;197-202; Pkc_Phospho_Site 215-217;247-249;278-280; Prokar_Lipoprotein 30-40; Rgd 166-168;183-185;

[0678] DEX0267_146 Ck2_Phospho_Site 16-19;86-89; Pkc_Phospho_Site 79-81;92-94;

[0679] DEX0267_147 Ck2_Phospho_Site 36-39; Myristyl 72-77; Pkc_Phospho_Site 29-31;42-44;45-47;

[0680] DEX0267_148 Asn_Glycosylation 13-16; Camp_Phospho_Site 28-31; Ck2_Phospho_Site 75-78;

[0681] DEX0267_149 Ck2_Phospho_Site 3-6; Myristyl 9-14; Pkc_Phospho_Site 27-29;

[0682] DEX0267_150 Ck2_Phospho_Site 9-12;21-24; Pkc_Phospho_Site 18-20;28-30;34-36;

[0683] DEX0267_151 Myristyl 22-27;

[0684] DEX0267_152 Glycosaminoglycan 3-6;9-12;

[0685] DEX0267_153 Amidation 67-70; Camp_Phospho_Site 69-72; Myristyl 64-69; Pkc_Phospho_Site 30-32;56-58;

[0686] DEX0267_154 Asn_Glycosylation 12-15; Myristyl 51-56;

[0687] DEX0267_155 Asn_Glycosylation 65-68; Ck2_Phospho_Site 24-27;50-53; Myristyl 98-103; Pkc_Phospho_Site 57-59;70-72;

[0688] DEX0267_156 Pkc_Phospho_Site 10-12;64-66;

[0689] DEX0267_157 Asn_Glycosylation 27-30;

[0690] DEX0267_158 Ck2_Phospho_Site 125-128; Pkc_Phospho_Site 32-34;77-79;125-127;

[0691] DEX0267_159 Ck2_Phospho_Site 53-56;97-100; Pkc_Phospho_Site 93-95;

[0692] DEX0267_160 Ck2_Phospho_Site 5-8;

[0693] DEX0267_162 Amidation 19-22; Camp_Phospho_Site 22-25; Myristyl 9-14; Rgd 79-81;

[0694] DEX0267_163 Ck2_Phospho_Site 37-40;

[0695] DEX0267_165 Myristyl 24-29;

[0696] DEX0267_166 Myristyl 17-22;

[0697] DEX0267_167 Ck2_Phospho_Site 59-62;

[0698] DEX0267_168 Asn_Glycosylation 64-67; Myristyl 62-67; Tyr_Phospho_Site 47-54;

[0699] DEX0267_169 Amidation 179-182; Camp_Phospho_Site 11-14;68-71;69-72;189-192; Ck2_Phospho_Site 42-45;80-83;116-119;124-127; Myristyl 144-149; Pkc_Phospho_Site 7-9;17-19;42-44;65-67;72-74;80-82;116-118;124-126;157-159;187-189;192-194;203-205; Rgd 38-40;183-185;

[0700] DEX0267_170 Asn_Glycosylation 50-53; Pkc_Phospho_Site 28-30;

[0701] DEX0267_171 Ck2_Phospho_Site 2-5;120-123;140-143; Myristyl 73-78;79-84;110-115; Pkc_Phospho_Site 8-10;19-21;39-41;92-94;120-122;

[0702] DEX0267_172 Myristyl 5-10;

[0703] DEX0267_173 Ck2_Phospho_Site 40-43; Pkc_Phospho_Site 13-15;

[0704] DEX0267_175 Ck2_Phospho_Site 4-7; Myristyl 115-120;121-126; Pkc_Phospho_Site 93-95;

[0705] DEX0267_176 Myristyl 108-113;

[0706] DEX0267_178 Amidation 67-70;94-97;122-125; Camp_Phospho_Site 32-35;57-60;75-78;103-106;114-117;119-122;175-178; Ck2_Phospho_Site 2-5;60-63;82-85;86-89;132-135;143-146;155-158;183-186;195-198;204-207; Pkc_Phospho_Site 26-28;31-33;37-39;41-43;56-58;86-88;106-108;117-119;122-124;132-134;143-145;178-180;194-196;195-197; Tyr_Phospho_Site 142-149;

[0707] DEX0267_179 Asn_Glycosylation 393-396; Camp_Phospho_Site 406-409; Ck2_Phospho_Site 46-49;143-146;164-167;238-241;312-315;362-365;384-387; Glycosaminoglycan 214-217; Myristyl 52-57;156-161;160-165;274-279; Pkc_Phospho_Site 157-159;208-210;222-224;349-351;408-410;409-411;418-420;

[0708] DEX0267_180 Ck2_Phospho_Site 36-39;

[0709] DEX0267_181 Ck2_Phospho_Site 46-49;

[0710] DEX0267_182 Asn_Glycosylation 172-175; Ck2_Phospho_Site 141-144;170-173; Myristyl 176-181; Pkc_Phospho_Site 29-31;67-69;141-143; Prokar_Lipoprotein 110-120;

[0711] DEX0267_184 Ck2_Phospho_Site 22-25; Myristyl 99-104;

[0712] DEX0267_185 Ck2_Phospho_Site 21-24;

[0713] DEX0267_186 Asn_Glycosylation 17-20; Pkc_Phospho_Site 31-33;41-43;50-52;

[0714] DEX0267_189 Myristyl 6-11;

[0715] DEX0267_190 Camp_Phospho_Site 62-65;63-66; Myristyl 13-18; Pkc_Phospho_Site 14-16;66-68;72-74;

[0716] DEX0267_191 Asn_Glycosylation 11-14;34-37; Pkc_Phospho_Site 17-19;36-38;

[0717] DEX0267_192 Ck2_Phospho_Site 24-27;148-151;231-234;257-260; Glycosaminoglycan 4-7; Myristyl 5-10;79-84;144-149;149-154;184-189; Pkc_Phospho_Site 9-11;

[0718] DEX0267_193 Myristyl 22-27;26-31; Prokar_Lipoprotein 42-52;84-94; Receptor_Cytokines_1 45-57;

[0719] DEX0267_194 Ck2_Phospho_Site 35-38; Myristyl 7-12;28-33;50-55;61-66; Pkc_Phospho_Site 31-33;51-53;65-67;126-128;

[0720] DEX0267_195 Ck2_Phospho_Site 35-38; Myristyl 31-36;74-79;

[0721] DEX0267_196 Camp_Phospho_Site 20-23; Ck2_Phospho_Site 23-26; Myristyl 29-34;

[0722] DEX0267_197 Asn_Glycosylation 93-96;94-97; Ck2_Phospho_Site 9-12;89-92;162-165;229-232; Pkc_Phospho_Site 72-74;124-126;143-145;

[0723] DEX0267_199 Camp_Phospho_Site 34-37; Myristyl 6-11; Pkc_Phospho_Site 18-20;37-39;

[0724] DEX0267_200 Pkc_Phospho_Site 21-23;112-114; Prokar_Lipoprotein 230-240;

[0725] DEX0267_201 Amidation 124-127; Ck2_Phospho_Site 68-71; Pkc_Phospho_Site 137-139;

[0726] DEX0267_202 Asn_Glycosylation 53-56; Myristyl 30-35; Pkc_Phospho_Site 3-5;15-17;

[0727] DEX0267_204 Ck2_Phospho_Site 56-59;

[0728] DEX0267_205 Ck2_Phospho_Site 29-32;

[0729] DEX0267_206 Ck2_Phospho_Site 16-19;23-26; Myristyl 21-26;

[0730] DEX0267_207 Asn_Glycosylation 8-11; Ck2_Phospho_Site 13-16;31-34; Myristyl 19-24;

[0731] DEX0267_208 Amidation 34-37; Myristyl 8-13;9-14;61-66; Pkc_Phospho_Site 45-47;53-55;

[0732] DEX0267_209 Myristyl 25-30;35-40;39-44; Pkc_Phospho_Site 13-15;57-59;

[0733] DEX0267_210 Asn_Glycosylation 26-29; Pkc_Phospho_Site 15-17;46-48;65-67; Tyr_Phospho_Site 73-80;

[0734] DEX0267_211 Ck2_Phospho_Site 6-9;58-61; Glycosaminoglycan 92-95; Myristyl 15-20;59-64;86-91; Pkc_Phospho_Site 120-122; Tyr_Phospho_Site 111-119;

[0735] DEX0267_212 Camp_Phospho_Site 58-61;113-116; Myristyl 100-105; Pkc_Phospho_Site 61-63;97-99;107-109;116-118;

[0736] DEX0267_213 Camp_Phospho_Site 115-118; Myristyl 126-131; Pkc_Phospho_Site 40-42;114-116;118-120; Tyr_Phospho_Site 81-88;

[0737] DEX0267_214 Amidation 27-30; Ck2_Phospho_Site 5-8;76-79;111-114; Myristyl 70-75; Pkc_Phospho_Site 23-25;85-87;111-113;

[0738] DEX0267_215 Ck2_Phospho_Site 54-57; Pkc_Phospho_Site 25-27;

[0739] DEX0267_217 Camp_Phospho_Site 87-90; Ck2_Phospho_Site 27-30;104-107;105-108; Myristyl 5-10;9-14; Pkc_Phospho_Site 26-28;101-103;104-106;
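For illustration, site predictions of the kind listed above can be reproduced in outline by scanning an amino acid sequence with the published ProSite consensus patterns. The sketch below covers four of them: N-{P}-[ST]-{P} for N-glycosylation, [RK](2)-x-[ST] for cAMP-dependent phosphorylation, [ST]-x(2)-[DE] for casein kinase II phosphorylation, and [ST]-x-[RK] for protein kinase C phosphorylation. The function name, the dictionary layout, and the toy peptide are assumptions made for this example; the peptide is not a sequence of the invention, and the server cited above may apply additional rules that are not modeled here.

```python
import re

# ProSite consensus patterns rewritten as regular expressions.
PATTERNS = {
    "Asn_Glycosylation": r"N[^P][ST][^P]",
    "Camp_Phospho_Site": r"[RK]{2}.[ST]",
    "Ck2_Phospho_Site": r"[ST].{2}[DE]",
    "Pkc_Phospho_Site": r"[ST].[RK]",
}


def scan_sites(sequence):
    """Return {site name: [(start, end), ...]} with 1-based, inclusive
    positions, reporting overlapping matches as separate sites."""
    hits = {}
    for name, pattern in PATTERNS.items():
        spans = []
        # A lookahead lets the scan report matches that overlap.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            start = match.start() + 1
            end = match.start() + len(match.group(1))
            spans.append((start, end))
        if spans:
            hits[name] = spans
    return hits


# Hypothetical toy peptide used only to exercise the scanner.
for site, spans in scan_sites("AANGTAKRASAA").items():
    print(site, ";".join(f"{s}-{e}" for s, e in spans))
```

Positions are reported as start-end ranges in the same style as the listing above, so the output can be compared with it line by line.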

Example 6 Method of Determining Alterations in a Gene Corresponding to a Polynucleotide

[0740] RNA is isolated from individual patients or from a family of individuals that have a phenotype of interest. cDNA is then generated from these RNA samples using protocols known in the art. See, Sambrook (2001), supra. The cDNA is then used as a template for PCR, employing primers surrounding regions of interest in SEQ ID NO: 1 through 115. Suggested PCR conditions consist of 35 cycles at 95° C. for 30 seconds; 60-120 seconds at 52-58° C.; and 60-120 seconds at 70° C., using buffer solutions described in Sidransky et al., Science 252(5006): 706-9 (1991). See also Sidransky et al., Science 278(5340): 1054-9 (1997).

[0741] PCR products are then sequenced using primers labeled at their 5′ end with T4 polynucleotide kinase, employing SequiTherm Polymerase (Epicentre Technologies). The intron-exon borders of selected exons are also determined and genomic PCR products analyzed to confirm the results. PCR products harboring suspected mutations are then cloned and sequenced to validate the results of the direct sequencing. PCR products are cloned into T-tailed vectors as described in Holton et al., Nucleic Acids Res., 19: 1156 (1991) and sequenced with T7 polymerase (United States Biochemical). Affected individuals are identified by mutations not present in unaffected individuals.

[0742] Genomic rearrangements may also be determined. Genomic clones are nick-translated with digoxigenin deoxyuridine 5′ triphosphate (Boehringer Mannheim), and FISH is performed as described in Johnson et al., Methods Cell Biol. 35: 73-99 (1991). Hybridization with the labeled probe is carried out using a vast excess of human cot-1 DNA for specific hybridization to the corresponding genomic locus.

[0743] Chromosomes are counterstained with 4′,6-diamidino-2-phenylindole and propidium iodide, producing a combination of C- and R-bands. Aligned images for precise mapping are obtained using a triple-band filter set (Chroma Technology, Brattleboro, Vt.) in combination with a cooled charge-coupled device camera (Photometrics, Tucson, Ariz.) and variable excitation wavelength filters. Id. Image collection, analysis and chromosomal fractional length measurements are performed using the ISee Graphical Program System (Inovision Corporation, Durham, N.C.). Chromosome alterations of the genomic region hybridized by the probe are identified as insertions, deletions, and translocations. These alterations are used as a diagnostic marker for an associated disease.

Example 7 Method of Detecting Abnormal Levels of a Polypeptide in a Biological Sample

[0744] Antibody-sandwich ELISAs are used to detect polypeptides in a sample, preferably a biological sample. Wells of a microtiter plate are coated with specific antibodies, at a final concentration of 0.2 to 10 μg/ml. The antibodies are either monoclonal or polyclonal and are produced by the method described above. The wells are blocked so that non-specific binding of the polypeptide to the well is reduced. The coated wells are then incubated for >2 hours at RT with a sample containing the polypeptide. Preferably, serial dilutions of the sample should be used to validate results. The plates are then washed three times with deionized or distilled water to remove unbound polypeptide. Next, 50 μl of specific antibody-alkaline phosphatase conjugate, at a concentration of 25-400 ng, is added and incubated for 2 hours at room temperature. The plates are again washed three times with deionized or distilled water to remove unbound conjugate. 75 μl of 4-methylumbelliferyl phosphate (MUP) or p-nitrophenyl phosphate (NPP) substrate solution is added to each well and incubated for 1 hour at room temperature.

[0745] The reaction is measured by a microtiter plate reader. A standard curve is prepared, using serial dilutions of a control sample, and polypeptide concentrations are plotted on the X-axis (log scale) and fluorescence or absorbance on the Y-axis (linear scale). The concentration of the polypeptide in the sample is calculated using the standard curve.
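The read-out from the standard curve may be sketched as follows, purely as an example. The sketch fits a straight line of signal against log10 of concentration by least squares and inverts it for an unknown sample. It assumes the standards fall within the approximately linear portion of the curve; a full ELISA curve is typically sigmoidal and is often fitted with a four-parameter logistic model instead. The function names and the standard values are hypothetical.

```python
from math import log10


def fit_standard_curve(concentrations, signals):
    """Least-squares line: signal = intercept + slope * log10(concentration)."""
    xs = [log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_signal(signal, slope, intercept):
    """Invert the fitted line to estimate the sample concentration."""
    return 10 ** ((signal - intercept) / slope)


# Hypothetical tenfold serial dilutions of a control (ng/ml) and their
# absorbance readings.
standards = [1.0, 10.0, 100.0, 1000.0]
readings = [0.1, 0.4, 0.7, 1.0]
slope, intercept = fit_standard_curve(standards, readings)
print(concentration_from_signal(0.55, slope, intercept))
```

Readings outside the range spanned by the standards are extrapolations and are better handled by re-assaying a further dilution of the sample.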

Example 8 Formulating a Polypeptide

[0746] The secreted polypeptide composition will be formulated and dosed in a fashion consistent with good medical practice, taking into account the clinical condition of the individual patient (especially the side effects of treatment with the secreted polypeptide alone), the site of delivery, the method of administration, the scheduling of administration, and other factors known to practitioners. The “effective amount” for purposes herein is thus determined by such considerations.

[0747] As a general proposition, the total pharmaceutically effective amount of secreted polypeptide administered parenterally per dose will be in the range of about 1 μg/kg/day to 10 mg/kg/day of patient body weight, although, as noted above, this will be subject to therapeutic discretion. More preferably, this dose is at least 0.01 mg/kg/day, and most preferably for humans between about 0.01 and 1 mg/kg/day for the hormone. If given continuously, the secreted polypeptide is typically administered at a dose rate of about 1 μg/kg/hour to about 50 mg/kg/hour, either by 1-4 injections per day or by continuous subcutaneous infusions, for example, using a mini-pump. An intravenous bag solution may also be employed. The length of treatment needed to observe changes and the interval following treatment for responses to occur appear to vary depending on the desired effect.
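The weight-based ranges above translate into absolute daily amounts by simple multiplication, as the following sketch shows. It is an arithmetic illustration only, not dosing guidance; the function name and the 70 kg body weight are assumptions made for this example.

```python
def daily_dose_range_mg(weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Convert a mg/kg/day range into an absolute mg/day range."""
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


# General range stated above: about 1 ug/kg/day (0.001 mg/kg/day)
# to 10 mg/kg/day, for a hypothetical 70 kg patient.
general = daily_dose_range_mg(70.0, 0.001, 10.0)

# Most preferred range stated above for humans: about 0.01 to 1 mg/kg/day.
preferred = daily_dose_range_mg(70.0, 0.01, 1.0)

print(general)
print(preferred)
```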

[0748] Pharmaceutical compositions containing the secreted protein ofthe invention are administered orally, rectally, parenterally,intracistemally, intravaginally, intraperitoneally, topically (as bypowders, ointments, gels, drops or transdermal patch), bucally, or as anoral or nasal spray. “Pharmaceutically acceptable carrier” refers to anon-toxic solid, semisolid or liquid filler, diluent, encapsulatingmaterial or formulation auxiliary of any type. The term “parenteral” asused herein refers to modes of administration which include intravenous,intramuscular, intraperitoneal, intrastemal, subcutaneous andintraarticular injection and infusion.

[0749] The secreted polypeptide is also suitably administered bysustained-release systems. Suitable examples of sustained-releasecompositions include semipermeable polymer matrices in the form ofshaped articles, e. g., films, or microcapsules. Sustained-releasematrices include polylactides (U. S. Pat. No.3,773,919, EP 58,481),copolymers of L-glutamic acid and gamma-ethyl-L-glutamate (Sidman, U. etal., Biopolymers 22: 547-556 (1983)), poly (2-hydroxyethyl methacrylate)(R. Langer et al., J. Biomed. Mater. Res. 15: 167-277 (1981), and R.Langer, Chem. Tech. 12: 98-105 (1982)), ethylene vinyl acetate (R.Langer et al.) or poly-D-(−)-3-hydroxybutyric acid (EP 133,988).Sustained-release compositions also include liposomally entrappedpolypeptides. Liposomes containing the secreted polypeptide are preparedby methods known per se: D E Epstein et al., Proc. Natl. Acad. Sci. USA82: 3688-3692 (1985); Hwang et al., Proc. Natl. Acad. Sci. USA 77:4030-4034 (1980); EP 52,322; EP 36,676; EP 88,046; EP 143,949; EP142,641; Japanese Pat. Appl. 83-118008; U.S. Pat. Nos. 4,485,045 and4,544,545; and EP 102,324. Ordinarily, the liposomes are of the small(about 200-800 Angstroms) unilamellar type in which the lipid content isgreater than about 30 mol. percent cholesterol, the selected proportionbeing adjusted for the optimal secreted polypeptide therapy.

[0750] For parenteral administration, in one embodiment, the secreted polypeptide is formulated generally by mixing it at the desired degree of purity, in a unit dosage injectable form (solution, suspension, or emulsion), with a pharmaceutically acceptable carrier, i.e., one that is non-toxic to recipients at the dosages and concentrations employed and is compatible with other ingredients of the formulation.

[0751] For example, the formulation preferably does not include oxidizing agents and other compounds that are known to be deleterious to polypeptides. Generally, the formulations are prepared by contacting the polypeptide uniformly and intimately with liquid carriers or finely divided solid carriers or both. Then, if necessary, the product is shaped into the desired formulation. Preferably the carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood of the recipient. Examples of such carrier vehicles include water, saline, Ringer's solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl oleate are also useful herein, as well as liposomes.

[0752] The carrier suitably contains minor amounts of additives such as substances that enhance isotonicity and chemical stability. Such materials are non-toxic to recipients at the dosages and concentrations employed, and include buffers such as phosphate, citrate, succinate, acetic acid, and other organic acids or their salts; antioxidants such as ascorbic acid; low molecular weight (less than about ten residues) polypeptides, e.g., polyarginine or tripeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids, such as glycine, glutamic acid, aspartic acid, or arginine; monosaccharides, disaccharides, and other carbohydrates including cellulose or its derivatives, glucose, mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or sorbitol; counterions such as sodium; and/or nonionic surfactants such as polysorbates, poloxamers, or PEG.

[0753] The secreted polypeptide is typically formulated in such vehicles at a concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of about 3 to 8. It will be understood that the use of certain of the foregoing excipients, carriers, or stabilizers will result in the formation of polypeptide salts.

[0754] Any polypeptide to be used for therapeutic administration can be sterile. Sterility is readily accomplished by filtration through sterile filtration membranes (e.g., 0.2 micron membranes). Therapeutic polypeptide compositions generally are placed into a container having a sterile access port, for example, an intravenous solution bag or vial having a stopper pierceable by a hypodermic injection needle.

[0755] Polypeptides ordinarily will be stored in unit or multi-dose containers, for example, sealed ampules or vials, as an aqueous solution or as a lyophilized formulation for reconstitution. As an example of a lyophilized formulation, 10-ml vials are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution, and the resulting mixture is lyophilized. The infusion solution is prepared by reconstituting the lyophilized polypeptide using bacteriostatic Water-for-Injection.
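
By way of illustration only, the arithmetic behind the lyophilized formulation above may be sketched as follows. The sketch assumes that 1% (w/v) denotes 1 g of polypeptide per 100 ml of solution, i.e., 10 mg/ml. The function names and the 5 ml reconstitution volume in the usage lines are hypothetical and are not specified in this disclosure.

```python
def polypeptide_mass_mg(fill_volume_ml: float, percent_w_v: float) -> float:
    """Mass of polypeptide (mg) in a vial filled with a % (w/v) solution.

    Assumption: 1% (w/v) = 1 g per 100 ml = 10 mg per ml.
    """
    return fill_volume_ml * percent_w_v * 10.0


def reconstituted_concentration_mg_per_ml(mass_mg: float, diluent_ml: float) -> float:
    """Concentration after the lyophilized cake is dissolved in diluent_ml of diluent."""
    if diluent_ml <= 0:
        raise ValueError("diluent volume must be positive")
    return mass_mg / diluent_ml


# 5 ml of 1% (w/v) solution per 10-ml vial, as in the example above.
mass = polypeptide_mass_mg(fill_volume_ml=5.0, percent_w_v=1.0)

# Hypothetical reconstitution volume; the text does not state one.
concentration = reconstituted_concentration_mg_per_ml(mass, diluent_ml=5.0)

print(f"{mass:.0f} mg per vial, {concentration:.0f} mg/ml after reconstitution")
```

Under these assumptions each vial holds 50 mg of polypeptide, and reconstitution in 5 ml would give 10 mg/ml, which lies within the 0.1 mg/ml to 100 mg/ml formulation range stated above.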

[0756] The invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration. In addition, the polypeptides of the present invention may be employed in conjunction with other therapeutic compounds.

Example 9 Method of Treating Decreased Levels of the Polypeptide

[0757] It will be appreciated that conditions caused by a decrease in the standard or normal expression level of a secreted protein in an individual can be treated by administering the polypeptide of the present invention, preferably in the secreted form. Thus, the invention also provides a method of treatment of an individual in need of an increased level of the polypeptide comprising administering to such an individual a pharmaceutical composition comprising an amount of the polypeptide to increase the activity level of the polypeptide in such an individual.

[0758] For example, a patient with decreased levels of a polypeptide receives a daily dose of 0.1-100 μg/kg of the polypeptide for six consecutive days. Preferably, the polypeptide is in the secreted form. The exact details of the dosing scheme, based on administration and formulation, are provided above.
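
As a non-limiting sketch, the weight-based dosing of this Example may be computed as shown below. The 70 kg body weight and the function names are assumptions introduced purely for illustration. Only the 0.1-100 μg/kg daily range and the six-day duration come from the text.

```python
def daily_dose_range_ug(weight_kg: float,
                        low_ug_per_kg: float = 0.1,
                        high_ug_per_kg: float = 100.0) -> tuple[float, float]:
    """Lower and upper daily dose (in micrograms) for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return weight_kg * low_ug_per_kg, weight_kg * high_ug_per_kg


def course_total_ug(daily_dose_ug: float, days: int = 6) -> float:
    """Cumulative dose over a course of consecutive daily doses."""
    return daily_dose_ug * days


# Hypothetical 70 kg patient; the weight is not taken from the disclosure.
low, high = daily_dose_range_ug(70.0)
print(f"daily: {low:.1f} to {high:.1f} ug")
print(f"six-day total: {course_total_ug(low):.1f} to {course_total_ug(high):.1f} ug")
```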

Example 10 Method of Treating Increased Levels of the Polypeptide

[0759] Antisense technology is used to inhibit production of a polypeptide of the present invention. This technology is one example of a method of decreasing levels of a polypeptide, preferably a secreted form, due to a variety of etiologies, such as cancer.

[0760] For example, a patient diagnosed with abnormally increased levels of a polypeptide is administered antisense polynucleotides intravenously at 0.5, 1.0, 1.5, 2.0 and 3.0 mg/kg/day for 21 days. This treatment is repeated after a 7-day rest period if the treatment was well tolerated. The formulation of the antisense polynucleotide is provided above.
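
The schedule of this Example may be illustrated, without limitation, by the following sketch. It assumes that the listed figures are alternative dose levels expressed in mg per kg of body weight per day, since the text does not state whether they are applied to separate patients or as an escalation. The 70 kg body weight and all function names are hypothetical.

```python
DOSE_LEVELS_MG_PER_KG_DAY = (0.5, 1.0, 1.5, 2.0, 3.0)  # from the Example
DOSING_DAYS = 21  # from the Example
REST_DAYS = 7     # from the Example


def course_total_mg(dose_mg_per_kg_day: float, weight_kg: float,
                    dosing_days: int = DOSING_DAYS) -> float:
    """Cumulative antisense dose (mg) over one dosing course."""
    return dose_mg_per_kg_day * weight_kg * dosing_days


def cycle_calendar(dosing_days: int = DOSING_DAYS,
                   rest_days: int = REST_DAYS) -> list[bool]:
    """One treatment cycle as a list of days; True marks a dosing day."""
    return [True] * dosing_days + [False] * rest_days


# Hypothetical 70 kg patient.
for level in DOSE_LEVELS_MG_PER_KG_DAY:
    total = course_total_mg(level, 70.0)
    print(f"{level} mg/kg/day -> {total:.0f} mg per 21-day course")

print(f"cycle length: {len(cycle_calendar())} days")
```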

Example 11 Method of Treatment Using Gene Therapy

[0761] One method of gene therapy transplants fibroblasts, which are capable of expressing a polypeptide, onto a patient. Generally, fibroblasts are obtained from a subject by skin biopsy. The resulting tissue is placed in tissue-culture medium and separated into small pieces. Small chunks of the tissue are placed on a wet surface of a tissue culture flask; approximately ten pieces are placed in each flask. The flask is turned upside down, closed tight and left at room temperature overnight. After 24 hours at room temperature, the flask is inverted and the chunks of tissue remain fixed to the bottom of the flask and fresh media (e.g., Ham's F12 media, with 10% FBS, penicillin and streptomycin) is added. The flasks are then incubated at 37° C. for approximately one week.

[0762] At this time, fresh media is added and subsequently changed every several days. After an additional two weeks in culture, a monolayer of fibroblasts emerges. The monolayer is trypsinized and scaled into larger flasks. pMV-7 (Kirschmeier, P. T. et al., DNA, 7: 219-25 (1988)), flanked by the long terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and HindIII and subsequently treated with calf intestinal phosphatase. The linear vector is fractionated on agarose gel and purified, using glass beads.

[0763] The cDNA encoding a polypeptide of the present invention can be amplified using PCR primers which correspond to the 5′ and 3′ end sequences respectively as set forth in Example 1. Preferably, the 5′ primer contains an EcoRI site and the 3′ primer includes a HindIII site. Equal quantities of the Moloney murine sarcoma virus linear backbone and the amplified EcoRI and HindIII fragment are added together, in the presence of T4 DNA ligase. The resulting mixture is maintained under conditions appropriate for ligation of the two fragments. The ligation mixture is then used to transform bacteria HB101, which are then plated onto agar containing kanamycin for the purpose of confirming that the vector has the gene of interest properly inserted.
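
Purely as an illustrative sketch, primers of the kind described above may be assembled as follows. The actual primer sequences are those set forth in Example 1 and are not reproduced here. The toy cDNA, the 18-nucleotide annealing length, the four-base 5′ clamp, and the function names are all hypothetical assumptions. The recognition sequences GAATTC (EcoRI) and AAGCTT (HindIII) are the standard ones.

```python
ECORI_SITE = "GAATTC"    # EcoRI recognition sequence
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(cdna: str, anneal_len: int = 18, clamp: str = "GCGC") -> tuple[str, str]:
    """Return (5' primer, 3' primer), each written 5' to 3'.

    The 5' primer carries an EcoRI site and the 3' primer a HindIII site,
    each preceded by a short clamp (an assumption) to aid enzyme cutting.
    """
    cdna = cdna.upper()
    if len(cdna) < 2 * anneal_len:
        raise ValueError("cDNA too short for the requested annealing length")
    for name, site in (("EcoRI", ECORI_SITE), ("HindIII", HINDIII_SITE)):
        if site in cdna:
            # An internal site would be cut during digestion of the insert.
            raise ValueError(f"cDNA contains an internal {name} site")
    forward = clamp + ECORI_SITE + cdna[:anneal_len]
    reverse = clamp + HINDIII_SITE + reverse_complement(cdna[-anneal_len:])
    return forward, reverse


# Hypothetical toy cDNA, not a sequence of the invention.
toy_cdna = "ATGGCTAGCTGGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGTAA"
fwd, rev = design_primers(toy_cdna)
print("5' primer:", fwd)
print("3' primer:", rev)
```

Checking the insert for internal EcoRI and HindIII sites before primer design is a design choice of this sketch: either site inside the cDNA would fragment the insert during the digestion that precedes ligation into the EcoRI/HindIII-cut backbone.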

[0764] The amphotropic pA317 or GP+am12 packaging cells are grown in tissue culture to confluent density in Dulbecco's Modified Eagle's Medium (DMEM) with 10% calf serum (CS), penicillin and streptomycin. The MSV vector containing the gene is then added to the media and the packaging cells transduced with the vector. The packaging cells now produce infectious viral particles containing the gene (the packaging cells are now referred to as producer cells).

[0765] Fresh media is added to the transduced producer cells, and subsequently, the media is harvested from a 10 cm plate of confluent producer cells. The spent media, containing the infectious viral particles, is filtered through a Millipore filter to remove detached producer cells and this media is then used to infect fibroblast cells. Media is removed from a sub-confluent plate of fibroblasts and quickly replaced with the media from the producer cells. This media is removed and replaced with fresh media.

[0766] If the titer of virus is high, then virtually all fibroblasts will be infected and no selection is required. If the titer is very low, then it is necessary to use a retroviral vector that has a selectable marker, such as neo or his. Once the fibroblasts have been efficiently infected, the fibroblasts are analyzed to determine whether protein is produced.

[0767] The engineered fibroblasts are then transplanted onto the host, either alone or after having been grown to confluence on Cytodex 3 microcarrier beads.

Example 12 Method of Treatment Using Gene Therapy-In Vivo

[0768] Another aspect of the present invention is using in vivo gene therapy methods to treat disorders, diseases and conditions. The gene therapy method relates to the introduction of naked nucleic acid (DNA, RNA, and antisense DNA or RNA) sequences into an animal to increase or decrease the expression of the polypeptide.

[0769] The polynucleotide of the present invention may be operatively linked to a promoter or any other genetic elements necessary for the expression of the polypeptide by the target tissue. Such gene therapy and delivery techniques and methods are known in the art, see, for example, WO 90/11092, WO 98/11779; U.S. Pat. Nos. 5,693,622; 5,705,151; 5,580,859; Tabata H. et al. (1997) Cardiovasc. Res. 35 (3): 470-479, Chao J et al. (1997) Pharmacol. Res. 35 (6): 517-522, Wolff J. A. (1997) Neuromuscul. Disord. 7 (5): 314-318, Schwartz B. et al. (1996) Gene Ther. 3 (5): 405-411, Tsurumi Y. et al. (1996) Circulation 94 (12): 3281-3290 (incorporated herein by reference).

[0770] The polynucleotide constructs may be delivered by any method that delivers injectable materials to the cells of an animal, such as injection into the interstitial space of tissues (heart, muscle, skin, lung, liver, intestine and the like). The polynucleotide constructs can be delivered in a pharmaceutically acceptable liquid or aqueous carrier.

[0771] The term “naked” polynucleotide, DNA or RNA, refers to sequences that are free from any delivery vehicle that acts to assist, promote, or facilitate entry into the cell, including viral sequences, viral particles, liposome formulations, lipofectin or precipitating agents and the like. However, the polynucleotides of the present invention may also be delivered in liposome formulations (such as those taught in Felgner P. L. et al. (1995) Ann. NY Acad. Sci. 772: 126-139 and Abdallah B. et al. (1995) Biol. Cell 85 (1): 1-7) which can be prepared by methods well known to those skilled in the art.

[0772] The polynucleotide vector constructs used in the gene therapy method are preferably constructs that will not integrate into the host genome nor will they contain sequences that allow for replication. Any strong promoter known to those skilled in the art can be used for driving the expression of DNA. Unlike other gene therapy techniques, one major advantage of introducing naked nucleic acid sequences into target cells is the transitory nature of the polynucleotide synthesis in the cells. Studies have shown that non-replicating DNA sequences can be introduced into cells to provide production of the desired polypeptide for periods of up to six months.

[0773] The polynucleotide construct can be delivered to the interstitial space of tissues within an animal, including that of muscle, skin, brain, lung, liver, spleen, bone marrow, thymus, heart, lymph, blood, bone, cartilage, pancreas, kidney, gall bladder, stomach, intestine, testis, ovary, uterus, rectum, nervous system, eye, gland, and connective tissue. Interstitial space of the tissues comprises the intercellular fluid, mucopolysaccharide matrix among the reticular fibers of organ tissues, elastic fibers in the walls of vessels or chambers, collagen fibers of fibrous tissues, or that same matrix within connective tissue ensheathing muscle cells or in the lacunae of bone. It is similarly the space occupied by the plasma of the circulation and the lymph fluid of the lymphatic channels. Delivery to the interstitial space of muscle tissue is preferred for the reasons discussed below. The constructs may be conveniently delivered by injection into the tissues comprising these cells. They are preferably delivered to and expressed in persistent, non-dividing cells which are differentiated, although delivery and expression may be achieved in non-differentiated or less completely differentiated cells, such as, for example, stem cells of blood or skin fibroblasts. In vivo muscle cells are particularly competent in their ability to take up and express polynucleotides.

[0774] For the naked polynucleotide injection, an effective dosage amount of DNA or RNA will be in the range of from about 0.05 μg/kg body weight to about 50 mg/kg body weight. Preferably the dosage will be from about 0.005 mg/kg to about 20 mg/kg and more preferably from about 0.05 mg/kg to about 5 mg/kg. Of course, as the artisan of ordinary skill will appreciate, this dosage will vary according to the tissue site of injection. The appropriate and effective dosage of nucleic acid sequence can readily be determined by those of ordinary skill in the art and may depend on the condition being treated and the route of administration. The preferred route of administration is by the parenteral route of injection into the interstitial space of tissues. However, other parenteral routes may also be used, such as inhalation of an aerosol formulation particularly for delivery to lungs or bronchial tissues, throat or mucous membranes of the nose. In addition, naked polynucleotide constructs can be delivered to arteries during angioplasty by the catheter used in the procedure.
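
The nested dosage ranges recited above mix μg/kg and mg/kg units. As a sketch only, they may be placed on a common mg/kg scale and a candidate dose assigned to the narrowest range containing it. The tier labels, the function name, and the treatment of the "about" endpoints as inclusive are assumptions of this illustration.

```python
# Ranges from the text, all converted to mg per kg body weight.
EFFECTIVE_RANGE = (0.05e-3, 50.0)   # about 0.05 ug/kg to about 50 mg/kg
PREFERRED_RANGE = (0.005, 20.0)     # about 0.005 mg/kg to about 20 mg/kg
MORE_PREFERRED_RANGE = (0.05, 5.0)  # about 0.05 mg/kg to about 5 mg/kg


def classify_dose(dose_mg_per_kg: float) -> str:
    """Name of the narrowest recited range that contains the dose.

    Endpoints are treated as inclusive, which is an assumption because
    the text qualifies every endpoint with "about".
    """
    tiers = (
        ("more preferred", MORE_PREFERRED_RANGE),
        ("preferred", PREFERRED_RANGE),
        ("effective", EFFECTIVE_RANGE),
    )
    for label, (low, high) in tiers:
        if low <= dose_mg_per_kg <= high:
            return label
    return "outside recited ranges"


for dose in (1.0, 0.01, 0.001, 100.0):
    print(f"{dose} mg/kg: {classify_dose(dose)}")
```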

[0775] The dose response effects of injected polynucleotide in muscle in vivo are determined as follows. Suitable template DNA for production of mRNA coding for polypeptide of the present invention is prepared in accordance with a standard recombinant DNA methodology. The template DNA, which may be either circular or linear, is either used as naked DNA or complexed with liposomes. The quadriceps muscles of mice are then injected with various amounts of the template DNA.

[0776] Five to six week old female and male Balb/C mice are anesthetized by intraperitoneal injection with 0.3 ml of 2.5% Avertin. A 1.5 cm incision is made on the anterior thigh, and the quadriceps muscle is directly visualized. The template DNA is injected in 0.1 ml of carrier in a 1 cc syringe through a 27 gauge needle over one minute, approximately 0.5 cm from the distal insertion site of the muscle into the knee and about 0.2 cm deep. A suture is placed over the injection site for future localization, and the skin is closed with stainless steel clips.

[0777] After an appropriate incubation time (e.g., 7 days) muscle extracts are prepared by excising the entire quadriceps. Every fifth 15 μm cross-section of the individual quadriceps muscles is histochemically stained for protein expression. A time course for protein expression may be done in a similar fashion except that quadriceps from different mice are harvested at different times. Persistence of DNA in muscle following injection may be determined by Southern blot analysis after preparing total cellular DNA and HIRT supernatants from injected and control mice.

[0778] The results of the above experimentation in mice can be used to extrapolate proper dosages and other treatment parameters in humans and other animals using naked DNA.

Example 13 Transgenic Animals

[0779] The polypeptides of the invention can also be expressed in transgenic animals. Animals of any species, including, but not limited to, mice, rats, rabbits, hamsters, guinea pigs, pigs, micro-pigs, goats, sheep, cows and non-human primates, e.g., baboons, monkeys, and chimpanzees may be used to generate transgenic animals. In a specific embodiment, techniques described herein or otherwise known in the art, are used to express polypeptides of the invention in humans, as part of a gene therapy protocol.

[0780] Any technique known in the art may be used to introduce the transgene (i.e., polynucleotides of the invention) into animals to produce the founder lines of transgenic animals. Such techniques include, but are not limited to, pronuclear microinjection (Paterson et al., Appl. Microbiol. Biotechnol. 40: 691-698 (1994); Carver et al., Biotechnology (NY) 11: 1263-1270 (1993); Wright et al., Biotechnology (NY) 9: 830-834 (1991); and Hoppe et al., U.S. Pat. No. 4,873,191 (1989)); retrovirus mediated gene transfer into germ lines (Van der Putten et al., Proc. Natl. Acad. Sci., USA 82: 6148-6152 (1985)), blastocysts or embryos; gene targeting in embryonic stem cells (Thompson et al., Cell 56: 313-321 (1989)); electroporation of cells or embryos (Lo, 1983, Mol Cell. Biol. 3: 1803-1814 (1983)); introduction of the polynucleotides of the invention using a gene gun (see, e.g., Ulmer et al., Science 259: 1745 (1993)); introducing nucleic acid constructs into embryonic pluripotent stem cells and transferring the stem cells back into the blastocyst; and sperm mediated gene transfer (Lavitrano et al., Cell 57: 717-723 (1989)); etc. For a review of such techniques, see Gordon, “Transgenic Animals,” Intl. Rev. Cytol. 115: 171-229 (1989), which is incorporated by reference herein in its entirety.

[0781] Any technique known in the art may be used to produce transgenic clones containing polynucleotides of the invention, for example, nuclear transfer into enucleated oocytes of nuclei from cultured embryonic, fetal, or adult cells induced to quiescence (Campbell et al., Nature 380: 64-66 (1996); Wilmut et al., Nature 385: 810-813 (1997)).

[0782] The present invention provides for transgenic animals that carry the transgene in all their cells, as well as animals which carry the transgene in some, but not all their cells, i.e., mosaic or chimeric animals. The transgene may be integrated as a single transgene or as multiple copies such as in concatamers, e.g., head-to-head tandems or head-to-tail tandems. The transgene may also be selectively introduced into and activated in a particular cell type by following, for example, the teaching of Lasko et al. (Lasko et al., Proc. Natl. Acad. Sci. USA 89: 6232-6236 (1992)). The regulatory sequences required for such a cell-type specific activation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art. When it is desired that the polynucleotide transgene be integrated into the chromosomal site of the endogenous gene, gene targeting is preferred. Briefly, when such a technique is to be utilized, vectors containing some nucleotide sequences homologous to the endogenous gene are designed for the purpose of integrating, via homologous recombination with chromosomal sequences, into and disrupting the function of the nucleotide sequence of the endogenous gene. The transgene may also be selectively introduced into a particular cell type, thus inactivating the endogenous gene in only that cell type, by following, for example, the teaching of Gu et al. (Gu et al., Science 265: 103-106 (1994)). The regulatory sequences required for such a cell-type specific inactivation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art.

[0783] Once transgenic animals have been generated, the expression of the recombinant gene may be assayed utilizing standard techniques. Initial screening may be accomplished by Southern blot analysis or PCR techniques to analyze animal tissues to verify that integration of the transgene has taken place. The level of mRNA expression of the transgene in the tissues of the transgenic animals may also be assessed using techniques which include, but are not limited to, Northern blot analysis of tissue samples obtained from the animal, in situ hybridization analysis, and reverse transcriptase-PCR (RT-PCR). Samples of transgenic gene-expressing tissue may also be evaluated immunocytochemically or immunohistochemically using antibodies specific for the transgene product.

[0784] Once the founder animals are produced, they may be bred, inbred, outbred, or crossbred to produce colonies of the particular animal. Examples of such breeding strategies include, but are not limited to: outbreeding of founder animals with more than one integration site in order to establish separate lines; inbreeding of separate lines in order to produce compound transgenics that express the transgene at higher levels because of the effects of additive expression of each transgene; crossing of heterozygous transgenic animals to produce animals homozygous for a given integration site in order to both augment expression and eliminate the need for screening of animals by DNA analysis; crossing of separate homozygous lines to produce compound heterozygous or homozygous lines; and breeding to place the transgene on a distinct background that is appropriate for an experimental model of interest.
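
The expected outcome of crossing heterozygous transgenic animals, one of the breeding strategies listed above, may be sketched as a simple Punnett-square enumeration. The sketch assumes a single integration site, Mendelian segregation, and equal viability of all genotypes, none of which is asserted by the text. The allele symbols "T" (transgene) and "+" (no transgene) and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict[str, Fraction]:
    """Expected offspring genotype fractions for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "T+" for an
    animal heterozygous for the transgene at one integration site.
    """
    counts = Counter(
        "".join(sorted(gametes)) for gametes in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Heterozygote x heterozygote; genotypes are reported with alleles sorted,
# so the heterozygote appears as "+T".
for genotype, fraction in sorted(cross("T+", "T+").items()):
    print(genotype, fraction)
```

Under these assumptions one quarter of the offspring are expected to be homozygous for the integration site. Once a homozygous line is established, every offspring of a homozygous-by-homozygous cross carries the transgene, which is consistent with the stated aim of eliminating DNA screening.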

[0785] Transgenic animals of the invention have uses which include, but are not limited to, animal model systems useful in elaborating the biological function of polypeptides of the present invention, studying conditions and/or disorders associated with aberrant expression, and in screening for compounds effective in ameliorating such conditions and/or disorders.

Example 14 Knock-Out Animals

[0786] Endogenous gene expression can also be reduced by inactivating or “knocking out” the gene and/or its promoter using targeted homologous recombination. (E.g., see Smithies et al., Nature 317: 230-234 (1985); Thomas & Capecchi, Cell 51: 503-512 (1987); Thompson et al., Cell 56: 313-321 (1989); each of which is incorporated by reference herein in its entirety). For example, a mutant, non-functional polynucleotide of the invention (or a completely unrelated DNA sequence) flanked by DNA homologous to the endogenous polynucleotide sequence (either the coding regions or regulatory regions of the gene) can be used, with or without a selectable marker and/or a negative selectable marker, to transfect cells that express polypeptides of the invention in vivo. In another embodiment, techniques known in the art are used to generate knockouts in cells that contain, but do not express the gene of interest. Insertion of the DNA construct, via targeted homologous recombination, results in inactivation of the targeted gene. Such approaches are particularly suited in research and agricultural fields where modifications to embryonic stem cells can be used to generate animal offspring with an inactive targeted gene (e.g., see Thomas & Capecchi 1987 and Thompson 1989, supra). However, this approach can be routinely adapted for use in humans provided the recombinant DNA constructs are directly administered or targeted to the required site in vivo using appropriate viral vectors that will be apparent to those of skill in the art.

[0787] In further embodiments of the invention, cells that are genetically engineered to express the polypeptides of the invention, or alternatively, that are genetically engineered not to express the polypeptides of the invention (e.g., knockouts) are administered to a patient in vivo. Such cells may be obtained from the patient (i.e., animal, including human) or an MHC compatible donor and can include, but are not limited to fibroblasts, bone marrow cells, blood cells (e.g., lymphocytes), adipocytes, muscle cells, endothelial cells etc. The cells are genetically engineered in vitro using recombinant DNA techniques to introduce the coding sequence of polypeptides of the invention into the cells, or alternatively, to disrupt the coding sequence and/or endogenous regulatory sequence associated with the polypeptides of the invention, e.g., by transduction (using viral vectors, and preferably vectors that integrate the transgene into the cell genome) or transfection procedures, including, but not limited to, the use of plasmids, cosmids, YACs, naked DNA, electroporation, liposomes, etc.

[0788] The coding sequence of the polypeptides of the invention can be placed under the control of a strong constitutive or inducible promoter or promoter/enhancer to achieve expression, and preferably secretion, of the polypeptides of the invention. The engineered cells which express and preferably secrete the polypeptides of the invention can be introduced into the patient systemically, e.g., in the circulation, or intraperitoneally.

[0789] Alternatively, the cells can be incorporated into a matrix and implanted in the body, e.g., genetically engineered fibroblasts can be implanted as part of a skin graft; genetically engineered endothelial cells can be implanted as part of a lymphatic or vascular graft. (See, for example, Anderson et al. U.S. Pat. No. 5,399,349; and Mulligan & Wilson, U.S. Pat. No. 5,460,959 each of which is incorporated by reference herein in its entirety).

[0790] When the cells to be administered are non-autologous or non-MHC compatible cells, they can be administered using well known techniques which prevent the development of a host immune response against the introduced cells. For example, the cells may be introduced in an encapsulated form which, while allowing for an exchange of components with the immediate extracellular environment, does not allow the introduced cells to be recognized by the host immune system.

[0791] Transgenic and “knock-out” animals of the invention have uses which include, but are not limited to, animal model systems useful in elaborating the biological function of polypeptides of the present invention, studying conditions and/or disorders associated with aberrant expression, and in screening for compounds effective in ameliorating such conditions and/or disorders.

[0792] All patents, patent publications, and other published references mentioned herein are hereby incorporated by reference in their entireties as if each had been individually and specifically incorporated by reference herein. While preferred illustrative embodiments of the present invention are described, one skilled in the art will appreciate that the present invention can be practiced by other than the described embodiments, which are presented for purposes of illustration only and not by way of limitation. The present invention is limited only by the claims that follow.

1 218 1 1767 DNA Homo sapien 1 cggccgcccg ggcaggtaca agcttttttttttttttttt ttttttttta aaaaactaaa 60 gtcaaatttt tttttttccc ataaaaccgcttctcttttt attaataaaa aaaataaaaa 120 taaaaagtgg aaccaaagag gaaaaggggtggttttaaga ggtggacccg tggtgggaaa 180 gagagaggcg agagggcgtg cgaggacacgagaaagaaca cgcgtgggaa cacgtgggag 240 gtggccccgg gggacacctc gagagagaggcagagagtgg cgtgtattca cacgctctca 300 tcatgagtgg tgacacaccg agactcgcgtggcgccgcgc ggcgtgtgtg tctcccagag 360 agagagagag ggcgtgtgta agatcatcacgcggtgggac actctcagca ggggcggtgt 420 gatgacgccc agtgtgtcgc actctgtgtgccaccgctgt gtgtgagtgt gagagagggc 480 gactattctc ttatagagca gagagacaccctgtgtgaga ctgtgtggga gaaaaagtgt 540 gtcgcgccac cacacacaac tctcccgccagaggctctct gtgtgtgaga gaggagagta 600 gtatataaga ggagggacag cggcggggggtgtatataaa ttttatctca catatttata 660 agccggtgtg tggtgtgatg tgagaggggaggggagagag tgtcatcttc tctcacacag 720 cggagagaga gagacggtgt gtgagggacggcgtgtggta gtttttcttc tcctcgccgc 780 cgaagaagaa gatgttacaa caaaagaagttgtgggggcc gcgcacacca aaataataga 840 aggattgttg tcgtgtgaga taatcctcgaccgcagaggc gcgcctctgc tcttcctcta 900 ttatgaggtg ctacgattaa taccccccacgattgtgttt atataatcac gccgactgtt 960 gctgtctccc gacgaagggg acgggcgaagctcgctccaa tggtgggggg cccccacaaa 1020 gaggagcaac aaagaggaga acgacgtggtagcagcacgt cataataaag acgggttgta 1080 ctaacgaggg ggggaaaaca actgctggtgtggaacacgg cggggggggg ggggggtggg 1140 tcgcaccccc caaaataatt aacaccgccagaacgaagaa gctctcacgc atcatccgct 1200 gcgaaaacac gcggccttct gtgggcgtacttagatgcag gcgggcgtgg tttttctccc 1260 ccacgaagtg gtgatgtgtg ctcccccccgaggggggagg gagtaattat aaacaccccc 1320 ctctctgtgg gggtgagaac acaaataattgttcgtcgta gggtgggtgt acacccacat 1380 cgtcagcaag agatctgtcc tggctgtgcgacaacccagc gtgtgtgtgg ggcgggcccc 1440 cctacaagag gatcagctcg cggtgtcgttggtataataa acaaccccac cgggggcgca 1500 gcgaggagga aaaacaaccc gtgcaggggcgtgctggcag aacaacagca gcggggaaga 1560 agattgcacc acgagtggga caaagacggacagggagcgt cgcacggcaa aatcttgctg 1620 gggcgggaaa caacaaaaca gctgcgagcagcggctggct gcgggcgtcc acaaacgatg 1680 cgtgtgcggg tgccctcctc 
cccccagaggtcgggggcgg cggcaacaca cagggagggc 1740 aaacaacgag cgagagtcac ccggtga 17672 541 DNA Homo sapien misc_feature (495)..(495) a, c, g or t 2gcgtggtcgc ggcgaggtac agtccagatc ttttctttaa ttcttatggt tttttttttt 60tttttttttt ttaaaaaatg gagtttgtgc aattttgcca aggttgattt tgaattccgg 120ggcccaattg atccccccac ctcagcctcc tgagtggtgg ggtttacggg ggtaacccat 180tgtgcctggt ttccagcttt ccttttaaat tagggggtta tagttcggca caaccaggac 240ccagggcagg aaatatacac ttccccaata gcaaattagc attaccgtga cctcctctgt 300gctaatatgg cacttttgtt aaccaagtga attgatgggt gtggagtggt gtggatgtag 360atgaagtgaa ttgaaacata tactacgtga taatttatat cccagagtcc tcaaaaatat 420tggtggcgtt gaaaaattgg ggagggcggg agtggaaatt cactgttgga tatagattaa 480ccacggtgaa attantggct tgcttgaaaa ggtcttaaag taagtggtgt tttttactca 540 g541 3 874 DNA Homo sapien misc_feature (770)..(770) a, c, g or t 3ctgagtttca gtcctggctc tacccttctt ggcctgtggt tctaagcatg ttatttgcct 60ctctcagctt caactgtgaa gagttcaatt aggtgatcac tttaactttt ctagctcgga 120tactctgtgc cagctctgga accatgcttt ttggtgtctg tgtgtatata taggtcacct 180gtatgtattt aggtcctttg aggaatctac tggacgtatc aaaaaaaaaa aaaaaacacc 240cacaaaaaga acagccccgt ggagctcttg agtgtgggtc tccacttagt gttgtgttgt 300gtttctcccc aatctctttc ttagaagcca gggaggggca cccttctgtg gggtcttcca 360ccattcttct tgaggcgagc cattccccag ccttccttct tcttcccaag cctgtgttct 420tgttacactt gggtgaaggg gggaagtgtg ttcccgggct ggagaactgg tgtttaacag 480gtaaggtctc tggccctccc aggtgactct ttttaggggg caggacccca ttcttggtaa 540gcccagcatt ggctctggcc ccagacactt tgtggtttgg tctcaggtaa tcggtggctg 600tccactaggc tgcttgttgg acctttcttg cgtggtgtcc atattggtct tcctttgtgc 660ggaaaattaa ttcctttcgc acttgccaca aaaaacccaa aacacaaaaa aggcgtgggg 720cgcccgtggc ctaagcgggt ccgtgggaga aatggttccg ccccacaacn accgccacac 780accacacaca gcgcgggcgg gggggcgctt aaaacagaac gaagggggac gacaggcaca 840caaggcagga ggaacagaga aaaaggggag agtg 874 4 557 DNA Homo sapienmisc_feature (404)..(486) a, c, g or t 4 gcatataatg tatattgttgaatccaatca ggatgcatgt gagatgatat attggctgaa 60 gctaccattt 
taccgctgtggctccctgag actcttgatt ctagcttctg tgtctgcgaa 120 cgtgataact ggaggaatactatcatagga atggtatata cgcatattga ggcacaaagt 180 tggagtgaat gaaagcgtactgattggagt tagaccagta gcactgaaca tagtgagtgc 240 acgagtacat ctataccccaacaaatagtc gatcactaca tcctggaagc ataccagcac 300 ccaagcaaca acaagacattaggctactag caatggggtt atattacaat taccttgact 360 agacacataa aagaacaatttcagagccca catgatttta gtannnnnnn nnnnnnnnnn 420 nnnnnnnnnn nnnnnnnnnnnnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480 nnnnnnatnc ccccacccaccaccccaccc caacccccct cccccacccc ccaaccccca 540 atagcccccc cccacaa 557 5504 DNA Homo sapien 5 atgtctatgg ggactggtgt ctcctagatg ctgctcagcggccgccaggt tgtgatggat 60 gcgtgtcgcg gccgaggtac ctgagatatg gtcagattctaaatacattt tagtggcaca 120 atctacaaaa cttaatgact caccagacat gggaaatgaaagaagcagag tcctgagata 180 acctaaagtt cttggcctga gcagctggaa gactggagtggccatttact gagacagaga 240 agctatgaga agaaccattt tgggggagaa gagaacatactgcgttggag aagtctatta 300 gatccggttg aagatgttga gtagctattt ggatatgtagcttttctcac agttccccaa 360 aactttacga tttgcctacc gactgagcca acagctaaatgtgtgccctg tttttaattc 420 tatgtgtagt ttgctgtaga aagagaaagc aactcttaaaacctgaaaag aaatgaaaat 480 ggaaactaat gaattacatt aaga 504 6 795 DNA Homosapien 6 tttttttttt tttatttttt tttgggaaaa aacagaaccc ccccaaaaacattaattttt 60 tttttaaaaa aatctgggcc cccttgggtg gtttccaatt tggtttccccccttttcccc 120 ttgaacccaa attcctaaaa cttgtttttc ttaaaaaatg agttgtggctacctttaacc 180 cataccctta actcgggtgg tgtcccacat agttgctccc accccagtacccagctctct 240 cctccaccct ttctctgcgg gtttccagtg ctcctcaggg ccgtgagcagcacgtgaggg 300 gctgggacga ttttttctcc tttaacgaat gtccagctct ccagccaagtttggagagcc 360 ttctctctca tgtgttggct caaacgcaac cgagaacgcg tgtttctcttgcgggttcga 420 taggccaata ggaggaccag attgttggat ttatttttct tcccgagtgtattatcccgg 480 actctaacca ctggtccaga ggcttgggtg cccaccaaca ctatatatctcctcgggggg 540 accttttcgt gggccctatc gagggagaac agcggttgtc tgtgcccagttggttccctt 600 aaacccccgt atggggggga gggacaaaac gtggtctcct cgcccaattttcggggatgt 660 ctccttttat tttccagcaa ccactttttc ttcaaaaagc 
tgggggggtaacctggggcc 720 ataggcctgg tcccccgtgg tgtaatttgg tcttcccgtt ccaatttcccccctactcac 780 agcacacccc accta 795 7 260 DNA Homo sapien 7 gccgggcaggtaccttatat tagttttctt atttattttc acagcatcct ttttctatgt 60 agcaatgagttgcttttttt ttgccttttt aaagatggaa gtcacagcaa aatgggaaat 120 taacttgcctattaattcat gcaacatgac aactgcagag caatgtctag agtaagacaa 180 tagtatgtcttattcttctt cagaaaatat tcttatatgt catatttagt taaaatatca 240 tgtatcatatcatatgttta 260 8 609 DNA Homo sapien 8 gcgatgttca tcaactatag gcgaatggtccctagatgca tgccgagcgg cgcaggttgt 60 gatggatcgg cgcccgggca ggtacattgttttttttttt tttttttttt ttttttgaaa 120 aaaaccccgg ttttaatacc ttattttttttggctttaaa aaaatttttt aaccatttta 180 aaaaaacccc ccctttcccc catttcagtttccccgttaa acgggtttaa aagttgaggc 240 aaagtgaatt tttgtctcca ccgagctttgggaccactca gcggttccgt gtgcaaagga 300 ccttctcgag acaccaaccc cctttgtgccaaaaaaattc gtggacagct ttttacactt 360 gttggtctta taaacaaata ccagacgcgggatattctcc cccccccctc gtagatgtgg 420 gacaaacccg ccttgtctca ccagccaaatctttctctct ccacccaaac acgagagctg 480 tggggggtat acatctcgag tggtctccaatagcgctgtg ttccacgcgt ggtgtgtaga 540 aatgtgtgtt tctctcgcgc ctctcaacatatctcccacc aaaaaattag cacaacacaa 600 aatggaatg 609 9 450 DNA Homo sapien9 actaatcatt atttctttct tttttttttt tttggggagg gagctcttgc tctgtcaccc 60aggcgggaat tgtcgggggt gcaatcttgg gctcacgtgg aacctcctcc tcttggggtt 120caaggtgatt ctccgtggtg cctcagccct cccgagttgg ggggcccccg ggtgcccgtt 180accagtgccc gggttaattt ctgggtatat ttaaggtaga agaacgaggg ttctcaccat 240tgttgggcca ggcgggtctc aaactccgtg gacttcaagt gatctgccca tctgggactc 300ccaaagggcg gtgggattac gaggcttgag ccaccatatg cggccgattt tataatgata 360ctctaaataa cacttttcct acactgggat ttgcccaaag atcattgggt gaacccttcc 420cacccttgtt tttgtgaagc aaacggaact 450 10 238 DNA Homo sapien 10atccttatta gatatgtaat ttggcacata ttttctccca ttttgtgggt tgtctttgtc 60ttttttattt tttattgctt gtgtttggta tcattattga taattcattg ttaaattaaa 120ggtcatgaac gagttaccaa aaaaacacaa cagaaaaaaa aaaaagcctg ggggaaaacc 180aggccaaacc gtttccccgg ggggaaattg tttccgccac attcaataaa 
acaaaaac 238 111925 DNA Homo sapien 11 tttttttttg aatgtttata acagctttat taatattggccaaaacttgg aagcaaccaa 60 gatgtccctc tataggtgca tagataaaca ttttatggcccatccataaa atgaaacatt 120 attcagcaat aaaaggaaat gaggtataaa gccatgaagagatatggggg aaatttaaat 180 tcatattgct aagtgagaga agccagtttg ttagtttattttataaatca ggatatggtt 240 tattttggtg aatattccat gtgtacttgc aaaaattgtgaattctacca cttttggtta 300 tagtggtcta taaatgtcca ttaggacaag tttatcctagtgttgttcag atctatcctt 360 gttaactttt tagctaattt atttagctaa aattaattttttagctaact tttattaatt 420 attaagagta aagcatttaa atcccaaata taattgtggatttgtcaatt tccccttgca 480 gttatgtcaa tttttacttc cggtattgtt atgtctttatcagggggcaa ttcaccaagc 540 agtgccctct ttctctctga taatattgct tcttttgaagtccacttttg tttatattaa 600 tattgtcatt ccaacttttt tttgactaga gtttgcacaatatattatcc ctttcttttt 660 atttacagtg agttacgggt agaaagcata tatttgggtctctaggaaga attgaatatt 720 taatacgtgg gacaatataa gatttctatt ttttcttgagtaggttttga taatttatat 780 tttctaggaa tttgtctatt ttctctaaac tttcaaaccctattggcata aattgttaac 840 actgtccctt aatcttttta atctttatgg tgtttttcaatatgctcccc cttttctttc 900 ataatattat ttctatatac ttttcttttt gtcttgattaatcggccaaa tgtttgtcta 960 ttttattaca aaactaaaat aaccaaaaca ggctggtactggcataaaga taaacattta 1020 gaccaataga atagaattga gactgcagaa gtaaactcatacatatatct tcaattgaat 1080 ttctacaagg gtgtcaggac cataccatca gaaaataatatttttcaaca aatcactttg 1140 ggtcaattgc atagatacat gcaaaacaat gaagctggactcctaacaca tactatatta 1200 aaaattaact tcaattgatt acagacatga atacaagagctaaaactaaa aattatagaa 1260 gaaaaagtaa gagtaactct tcatgacctt taatttaacaatgaattatc aataatgata 1320 ccaaaacaca agcaataaaa aataaaaaag acaaaagacaacccacaaaa tgggagaaaa 1380 tatgcgccaa attacatatc taataaggat ctgttatccagattataact cttacaacct 1440 tacaccaaga cataactcat ttagaaactg ccaaaagacttgaatagaca tttcttcaaa 1500 gaagatatac tagtggccac agaagcacat gaaaagatgctcaatatcat taatcattta 1560 gggaaatgca aaataaaatc acccatgaga taccactttacacctactag gatggctata 1620 atcaaaaaga aaacagaaaa taataaaggg gttctcaggatgtggagaaa tttgtaactt 1680 catacactgc 
tggtaggaat atacaatggc acagccaccatggagaacag ttgggaagtt 1740 cttcaaaaag ttgaacagaa ttacaatatg acccagcaaattccactcct agatatatac 1800 ccacagaaaa ataaaacttg tgtccactaa aaccttgtacacaaatgttc acagcaatat 1860 tattcataat agctaaaaaa aaagtagaaa caaccaatcaatggataaat ggataaacaa 1920 atgtc 1925 12 408 DNA Homo sapien 12ggttcgcggc cgaggtctgg gatgaagaat tctgaagcgg gaccactacc aggtcctgcc 60ccctgcccac agccctccca ggccgggcag ggcaagttct ggagggccgg tgggggcata 120cactgaaggc tgtgtgacgt ttctatttct caaggcagta acagcaaccg tgaacctcag 180aggcagccaa gggaaatgtt cctcccatat ggaaagtcag aagctgccag agaggcaagt 240ggagcatgca agacaactga tggcatagtc tcagaactga ccatgaatac ttgctctcca 300ctttccattg accaaagcaa gtccaacgtt gtgggaaagg gtccctcacc cacagtggga 360ggtgaggggt gtggacactt gccacttgct gattgatgac caaaatat 408 13 525 DNA Homosapien 13 tgctctcggc tgcgctcttc ttgctggcgt aatggttcac gttgatgggcccgtgcctcc 60 agccccagcg ggccggcaga gggccatgcc aggaccttgc aacaaacacaaggtgctcag 120 taagtgctga gtcctgggat gaagaattct gaagtgggac cactaccagggcctgccccc 180 tgcccacagc cctcccaggc cgggcagggc aagttctgga gggccggtgggggcatacac 240 tgaaggctgt gtgacgtttc tatttctcaa ggcagtaaca gcaaccgtgaacctcagagg 300 cagccaaggg aaatgttcct cccatatgga aagtcagaag ctgccagagaggcaagtgga 360 gcatgcaaga caactgatgg catagtctca gaactgacca tgaatacttgctctccactt 420 tccattgacc aaagcaagtc caacgttgtg ggaaagggtc cctcacccacagtgggaggt 480 gaggggtgtg gacacttgcc acttgctgat tgatgaccaa aatat 525 14504 DNA Homo sapien 14 ggtttcccac caatttatag ccaggagtgg ctctattttctctcgtgttg ctccagttca 60 ctaatttata gttctcgtgc gaagaacttt gtagctcagaaacaaaagat agagcaaaaa 120 agagctctct cagggttagg acgtgccaca catataggacatttaaatgg ccatcttctt 180 aataattcct ggggacatta aaactcaaat ctctggttggaaaatttgaa aagtttgtaa 240 accttatgtt cggcaaacgg taatagaaaa tatgttgatgatgaacatat ttggtttcca 300 tacaaactgt gcttccccat tctaaataga tgctagtttctctattcctc ctgggctggt 360 aaataaaagt ggccccaaat aaaaaaaaaa aaaaaacaaacaaacaaaca aaaaaggtcg 420 ggggcggaac ccctgggcaa agcgtgcccc cggggggaaaatttggtttc ccggcacaat 480 cccaaaatca 
agacaacaaa aggg 504 15 694 DNA Homosapien 15 cgtggtcgcg gcgaggtccc cccttttttt tttttttttt ttttttttttttggggtttt 60 cccttttttt tgggttttta aaatttcccc tcaaaaaata aaaatttcccggggggggtt 120 tcctaacacc cgggggtttt tttttatccc tcagggcctg ggggtttaaaaatttaaaaa 180 gccttgagat tttttttaaa caaaattgtg attattggcg ccagggcagggttgcgctac 240 aggcgctggt atccccacgc atttgtgaat gccccacacg gcgggttgtgaatgcgcctg 300 agtgctcagg gaattattag acgacgcggt ggtgcgtcat aattttgtagaacccccggc 360 ttcatcttaa aatataccaa aaaaatttac gccggggggg ggtgcggtgccgccttaatc 420 ccccggttat atcatctcca ggggaggcaa ggaaggcgtg cgacgggaagaatggcgctg 480 tagacgcctg tgggagggtg gaaacgtatg acagagcccc ccacaatttgctcccaattg 540 tgccccccca gccggtggca gaaaagacag gaaaccccct tcttcaaaaaaaaaaagggg 600 agagcgttgc ggcgtactac tgggccagaa gatggccccc gggtgaaaaatgttctcccc 660 gccccaaccc ccataacctg aaaaaaaaaa gtcg 694 16 988 DNA Homosapien 16 accaacaaac aaccacaaca ccaccacccc aaccaccggt gatagatcactatggggcca 60 tggtgcctct agatgctgct cgagcggcgc agtgtgatgg attggtcgcggcgaggtaca 120 ttaaaaaata tgacctcaat tttttaagtg tttaggatac aatgtaaattacatataaat 180 caaagctctg ttttccttgc acacaccctg ggtgagagac cgccgctcccggaggctctt 240 cgtcctctgc agaacacacc tgggggtggt gaaaggtgtc ggctgaagcatggagcacgt 300 cctccgggct ccccagtgac cttgggcact gccccccaac agagcttcaggcccctcccc 360 cactatggcc ccgaggatgc ccctcccagc ctgtctgagg agtcatgccaagtccctggc 420 acccagggtt aaattccttc atttgagcac acgtccggcg gcccttcattgtaagctctc 480 agtaaacggt tccccggaaa ttaaaataca aaaattctcc aacttcaatccatgaaatga 540 attataatta gagaaaaata aaatatgttt tagttttaat tttctataatcttaaaaaat 600 atttatgtat ctatctttta tgtctccgag aaggcacaca cagaaagtaaaaagcccagg 660 gcgggggctg cgcagcctgc cctcaggcct tcctccagca agggaggctccccagtgcgg 720 ccgcccgctt cccaggccaa ctcccagact gtgtccagtc cccaccctggcagtctgggc 780 aacaccaagc gagcttcttg aagccactaa cactcaagtc tcatactcaacatcaacaga 840 ccccggcctc atgggattgt acattaaata gacatactcg aatgcatggttgttatgctt 900 aaaaataagc taaagctggg tatctgtcaa gctgtctggt gaatgttcgccccccaacaa 960 aaaaaaaaaa attatataaa 
aaaaaata 988 17 221 DNA Homo sapien17 cagcggcgcc cgggcaggta ccgggcagct tgacctccat tgcttttggc ttttgtctct 60ttctcctttt gaagctcaaa agggcataga gtggactctg atcctaggat ttttttttcc 120ctgctttggc tgcctctgtt ttggttcatg tgtcaagcag agacggggaa agccaaacga 180cacaatgagc gttctcagaa aggaaacttc ttcggaatga a 221 18 765 DNA Homo sapien18 actagtcaca tatttattca aaagcaattt acaaagcttt ctatatcttt tcaacatatc 60cacctgcccc tccagcatct ttttggcatt caagtaacca ctatgattta ccccgctaag 120aaaaattatt cacatttcca tggcacaaat tgtaggaaag gaaaagacat tcttattcaa 180gcagcggaag ggttttggtg aaaaaacagg ttctggttct ggggaggttt ttgttatgtt 240aggtgatcgc ctctgaagtt tgcatcatca tagagctctt aacgttaacc agggctttgg 300cctcaggagg gaagccttat ctgtagcaga gatcagttgc aggaaacagc cccaaatctc 360aaacagtggc acccagactt gatggccgac caggcacaga tgtaagccat aaaaaagtac 420tcatttgctt gctggctaca agaaggagca ttttatctag tgagtccatc aggaggtcag 480gcgtaaagaa acatgtaccg ggcagcttga cctccattgc ttttggcttt tgtctctttc 540tccttttgaa gctcaaaagg gcatagagtg gactctgatc ctaggatttt tttttccctg 600ctttggctgc ctctgttttg gttcatgtgt caagcagaga cggggaaagc caaacgacac 660aatgagcgtt ctcagaaagg aaacttcttc ggaatgaaaa gctttggcca cattcgaaag 720ggtagaagtc tgagagaaac tttctcatca gggagactag gtcgg 765 19 408 DNA Homosapien misc_feature (268)..(268) a, c, g or t 19 gaccaactgt gctccatctccacgaggttg tgaagagaga aaatgggccg cctgcactac 60 agcatgagag ccatcagttagacaaaaaga agcatggtga gacaggcaag gccctccaga 120 gaaagccagg aaggcagtgagtggctttca aaaccgatgt ggtgcattca gaggctggaa 180 ggatggacaa tattactttcccagaaagtt tcgcaaaact ttctcttttg ttgacatgtt 240 gaaaatagca agccattgccgttccggntt tcccccccgg gtcccggcct gtgcgtctgc 300 tggcaagcat gttaatttccagaactcaca gaattaaagc cagagaggat ccttgtaact 360 catcttctct ccctccccagcctcccacag aaccataccc aaaagctt 408 20 1154 DNA Homo sapien misc_feature(1014)..(1014) a, c, g or t 20 atggtcaagc aaggcaggaa tgctcaggccaatcggttct ggctcaaatc cattccagag 60 gggagtgatt tacaacttag caccatcttcgtctcttccc tgagagtgaa gttaaatgac 120 ccaagaataa cattagtctg caccagacctcgcgagaatg atcttcctaa 
gggtggtcct 180 gggcatgggt tttcacctgc agaaactcaagcccagacat ccctccaaag ccctgtttta 240 ctaaagcatt ttaaatcttg tgggacagatgggaaaataa aacttgctgt tggaaccttg 300 ggatttataa gaaactcctt ggtcaacattataaggagga cggaatcttc caagctaatt 360 cttaacaatg caaagagtgt gatgtgttttggacacacac acctactcca aaatgcaagc 420 caatgtatgc atctgtgcag aggcagacacctacgcccat cagcacagca catgtgcaca 480 ccgcccttcc ctagacgctc aagtggtgagtataaggcag aggctgctga gtcttgggca 540 gaagcagcag gaggaggccc cagccatcccctggcctctg gagcagaaac atggggagcc 600 tttgaagttg ccacaactca ggtggaagcccctcagagca gccctaagag gaagtcattc 660 attttcaaac aaaggggaaa attcctcaggaccctggtgt ttccactaaa acaatcctct 720 gaaaagttat tccccagagg agcaaggaccagctgtgctc catctccacg aggttgtgaa 780 gagagaaaat gggccgcctg cactacagcatgagagccat cagttagaca aaaagaagca 840 tggtgagaca ggcaaggccc tccagagaaagccaggaagg cagtgagtgg ctttcaaaac 900 cgatgtggtg cattcagagg ctggaaggatggacaatatt actttcccag aaagtttcgc 960 aaaactttct cttttgttga catgttgaaaatagcaagcc attgccgttc cggntttccc 1020 ccccgggtcc cggcctgtgc gtctgctggcaagcatgtta atttccagaa ctcacagaat 1080 taaagccaga gaggatcctt gtaactcatcttctctccct ccccagcctc ccacagaacc 1140 atacccaaaa gctt 1154 21 735 DNAHomo sapien 21 gatgattgcc atataggcga atgggcctct aaatgccatg ctcgagcggcgcagtgtgat 60 ggatcggccg cccgtgcagg tccccccccc tttttttttt ttttttttttttttttttgt 120 tttaaaaaaa ttgactttgc ttttttactt tgggcggtgg ggccctgcttgaggtggtag 180 tgtgcccagg ggatgggtgg cctgtggaaa taataccaaa agtgtgtctgaaaggaagag 240 ggtgttgttt ttgaaggccg ggcccagggt gccctcaagt gccccgttatcttgagaaag 300 ggagacacgc cttgagagaa agagaattaa tgggaaacgc catacgttaggcgccaccaa 360 ttacatgata taaaaaattc ttggaaaaat ctatgctgac catcactggtggggtccaca 420 gtttctcaca tcatggcggt caatggaccc cgggtccctc tctggtgtccttgtgggaga 480 aggcgcagga tatgtcctgt gattcacatg agaagctggg gactgaaaattcatgggcca 540 ttacgcttgt tccctggtgt tgaaaatgag gtgtcatccc cgctccacaatttcccccac 600 aaatattatg cgaaaaacaa tcggccccca ttttgtggcg acgcccaacggtgagcaacc 660 gcaaggaaca aaaccgatac atgcaactga caaaaacaac cattcatgaacacacaaatg 720 
aacaaaatca agagt 735 22 218 DNA Homo sapien 22 catttaggcctcgtgctcta gatgctgtcg agcggcgcag ttgtgatgat cgagcggcgg 60 cccgggcaggtactagctct gaaaaccatt acgaagcaat gaactcatct gcaaataaaa 120 agcacatatctttaatttct aatgttttat tatagatttt taaagataca tatttatttt 180 tatattattagcttaaagaa agtaagtcac acaagaat 218 23 4779 DNA Homo sapien 23 gacactataaatgtctttcc ttatctgtgt gtactcttat ctcactgttc tattttttct 60 cctcatttatattaactctt tcttaccttt ttttctgaac ttctaggcct tctctttcca 120 gaactggtggaagacaaatg aaacggccaa gatggtaaga aacaagccgc atttctcctt 180 ggggagactgataatttaaa aggtttgttg tgtcagaaac attcccagct tcatcaccaa 240 ccctttccttccacctctgc ccactggaga ccacttacat cccgaagcgg acgcggcagc 300 tgaagtcaggaaaccatgca tcacattagc aggagccaac tgcagacttt aaactccgtt 360 caacatgtggatgcggcaga gaaatgacct gtccagacaa gccggggcag ctcataaact 420 ggttcatctgctccctgtgc gtcccgcggg tgcgtaagct ctggagcagc cggcgtccaa 480 ggacccggagaaaccttctg ctgggcactg cgtgtgccat ctacttgggc ttcctggtga 540 gccaggtggggagggcctct ctccagcatg gacaggcggc tgagaagggg ccacatcgca 600 gccgcgacaccgccgagcca tccttccctg agatacccct ggatggtacc ctggcccctc 660 cagagtcccagggcaatggg tccactctgc agcccaatgt ggtgtacatt accctacgct 720 ccaagcgcagcaagccggcc aatatccgtg gcaccgtgaa gcccaagcgc aggaaaaagc 780 atgcagtggcatcggctgcc ccagggcagg aggctttggt cggaccatcc cttcagccgc 840 aggaagcggcaagggaagct gatgctgtag cacctgggta cgctcaggga gcaaacctgg 900 ttaagattggagagcgaccc tggaggttgg tgcggggtcc gggagtgcga gccgggggcc 960 cagacttcctgcagcccagc tccagggaga gcaacattag gatctacagc gagagcgccc 1020 cctcctggctgagcaaagat gacatccgaa gaatgcgact cttggcggac agcgcagtgg 1080 cagggctccggcctgtgtcc tctaggagcg gagcccgttt gctggtgctg gaggggggcg 1140 cacctggcgctgtgctccgc tgtggcccta gcccctgtgg gcttctcaag cagcccttgg 1200 acatgagtgaggtgtttgcc ttccacctag acaggatcct ggggctcaac aggaccctgc 1260 cgtctgtgagcaggaaagca gagttcatcc aagatggccg cccatgcccc atcattcttt 1320 gggatgcatctttatcttca gcaagtaatg acacccattc ttctgttaag ctcacctggg 1380 gaacttatcagcagttgctg aaacagaaat gctggcagaa tggccgagta cccaagcctg 1440 
aatcaggttgtactgaaata catcatcatg agtggtccaa gatggcactc tttgattttt 1500 tgttacagatttataatcgc ttagatacaa attgctgtgg attcagacct cgcaaggaag 1560 atgcctgtgtacagaatgga ttgaggccaa aatgtgatga ccaaggttct gcggctctag 1620 cacacattatccagcgaaag catgacccaa ggcatttggt ttttatagac aacaagggtt 1680 tctttgacaggagtgaagat aacttaaact tcaaattgtt agaaggcatc aaagagtttc 1740 cagcttctgcagtttctgtt ttgaagagcc agcacttacg gcagaaactt cttcagtctc 1800 tgtttcttgataaagtgtat tgggaaagtc aaggaggtag acaaggaatt gacaagctta 1860 tcgatgtaatagaacacaga gccaaaattc ttatcaccta tatcaatgca cacggggtca 1920 aagtattacctatgaatgaa tgacaaaaga atcttctggc tagggtgtta gatatattta 1980 tgcatttttggttttgtttt taaatcaagc acatcaacct caagcccgtt tagcaatgag 2040 gcagtgtagatgaatacgta aaataaatga ctttaaccaa gtagctataa tgggacttag 2100 cactgtatgcatacttaaaa aggttttgaa aaacaaacta cttgagaaat atttgtttat 2160 atttttctctaacatcatgc tatgtgtcag tctgaacatc tgacaacaga aatttcagtt 2220 attattctagctaagttttg aaaacatttg tcatgctgtt taatagaaaa ctgcaaacca 2280 gagatactgactccattaat aaaccatatt ttgtgccgtt ttgactgttc tgaccaaata 2340 ctaatgggaacaattcttga cgtttttctg ttgctgattg ttaacataga gcagtctcta 2400 cactaccctgaggcaactct acattggaac actgaggctt acagcctgca agagcatcag 2460 agctgaccatacatttaaac agaaatgctg gtttatttgc aaaatcacca gtatattttc 2520 tattgtgtctataaaaaatc agtcatttaa gtacaagaat catattttcc attccttttt 2580 agaaatttattttgttgtcc ctatggaaat cattcacatc tgacaattta tatgttaaag 2640 agttttactctctctatttt ggtccaattt gtatctagtg gctgagaaat taaataattc 2700 taaagtatgaagttacctat ctgaaaatgt acttacagag tatcatttta aaatggatgt 2760 ctctttaaaaattttgttac ttttaccaac aatgtaatat aatttatgta tattttatta 2820 ataatagtgaattccttaaa atttgttcta tgtacttata tttaatttga tttaatggtt 2880 actgcccagatattgagaaa tggttcaaat attgagtgtg tttcaatata ttatctggct 2940 tatttcaacatgagtaatat gagcaaaata agttaaaacc tgcgtctgat caattttcct 3000 catgactagaactaaaacag taaatttgga caatattaag cctcaaataa tcatctccaa 3060 actccttctaacacttttta aatcagattg gaagacatgg acaaatcagg ttcatgtgtt 3120 gcatctttatgtcctttgcc aatatccaag 
atcatcacat atggtagata ttcacatgga 3180 gtttcaaattcagaatagat taccattacc ttcctgccct tacacatcct actccttatt 3240 taaaagttctatttgtgact tttcatttcc tgaaagttta aaaatacaat ttgagaatgt 3300 ttataatacattctctcctg tcttttcacg gttacgtctg ttattgctga aatacaccac 3360 attttctttgttctggtcaa ggttaactca atatctgtgt gaaagagaac tactaacaac 3420 gttacaatagaggctagatt tgaaaaaaaa aatctataga tctaattgat acaattgtag 3480 aacaaaatgtcaaaataatg ttttaagtat aagagaagat ggaccaagga gagagagatc 3540 atttgaaaatctaattgtag cttttctagg ctcacattca tgtactactt ttagcaccct 3600 tatgggctgtgctcgccccc tggacagttg agctttggat tatcttcctc ttcaattttc 3660 cctctattgacccgagtgtc tccctctgct tctacagatt tatagtactc cttggctctt 3720 ttgagtctccacttttactc actgtctctg ggatttttaa gatccttttc ttctcttata 3780 aatcatcctcttaatgaaaa ttagcctaac aaaagtttgg agactggaat cctactttga 3840 gccactgacttgaaataact cttttggcaa gttgcctgac atcctgtctt accaaggtgg 3900 catatttgcatttttactgc ttaaaacatt tttttttttt taccatcttt atccaaattt 3960 atcatattgatggtaggact aacaggcttt ttagaagctg gctttaactt tgagtctcaa 4020 gctacaatgctgttgggcag cctggtcttc ccacgtgagg gtttaacttt gtttatttgc 4080 ctccagttattccaaaatgc ttattaaatg aaaggcccag gaacatgttt attttagtca 4140 cctttgctttttaacaattt tgttttgtaa tcaatgagta attcatgatg aattattttt 4200 gactaatggatagccgaagg ccaagctttt aattctaata ggtaatgttc ttcttttgtc 4260 ttattgaaacaatgagaata ctctgtgcat ttcaaatgca ctccgattat gctgtggttt 4320 tattcacataagcacaatat gtgttttatt tataacttca taacaaactt ataatataat 4380 aatttaccttagcagacatg caaaagctta ttcttgtgtg acttactttc tttaagctaa 4440 taatataaaaataaatatgt atcttaaaaa tctataataa aacattagaa attaaagata 4500 tgtgctttttattttgcaga tgagttcatt tgcttctgta gatgtgtttt cagagctagg 4560 tacagaggaatgtttgctac ctttagcggt gaaaaaagaa agagagtcaa gaattttgtt 4620 ggattgtgtttgtgtgtgca tatatttgat atcatcatta tatttgtaat ctttggactt 4680 gtaatcatagcctgtttatt ctactgtgcc attaaatata ctttacctta tacataacga 4740 ataaaatacctagatgtaga tttatttaca aaaaaaaaa 4779 24 1173 DNA Homo sapien 24ttacgccaag cttatatcgt gaaatccaag agaaaggaaa aataaaaatg tttaatgcat 
60tttgaagcta agttgtccaa ctacattatg tctattgtcc caaatacctt ttcctgcact 120atgtattgat tcattgtcaa ttcattttaa tacctaagcc cttacactac attacaatag 180ttaacctttt catcaatatt atccaatgct tggcacagaa tagaacacta agcagaaagt 240aaccattgtc attattacta ttgggactat cattatacat gtaaaaagat tcttcctgtg 300ttcaaactgt agacaagatt gaatgacaag aggttgtctt tacaccaata tttaatattt 360gagtctccag agtcaccata ttacaaccag ggagaattaa aacatgatat gaaattgctc 420tagtaatgaa tttatcacct ataaaatacc cataaaacat aactttgtta ttgacagtaa 480cttctgattt atccctgccc attatctaat atctttttga ttgtcctaac tgatagtcaa 540catctagcaa tacaatgcaa gtacagtcaa tgtaaataga ttgcaaagcc gaagtgcaaa 600tctttccaaa agcatgggct ttcataaaat cagtttgggt gatttcagag aactgcttca 660attataggca aaggaactca cagaagaaaa ctagttaaca aatgagttgg ataaaggaag 720acgatggaca cttaaatata tttggattaa agggttagaa aagagtcact gtcaaaaatt 780catgaagttc ttgactattc ttttgtaaac aggaccctct ttgtgatgtt aatgttcaag 840tcaattgtga agagtaaggt ctgtaaagct gtcacacaat tttgtagaaa aaattaacca 900tttcctccaa aaaattaaca ttttattcat tttttattct aagatttagt gaagttgcta 960ttatgctatt atgaactaca tttggataaa tataaaataa acttactctc ataatttata 1020gctacagctt ttcatctatt cataataaaa ttttgatcac attttagtag ggtgtaaggc 1080cttactttaa gagaacaagt aattttacga taatgaagat ctctagtatg ttaaatgatg 1140gtgctgctgg gcatggtggc tcaagcctgt aac 1173 25 1301 DNA Homo sapienmisc_feature (520)..(520) a, c, g or t 25 ggtgcttttt tttttttttttttttcactg gtttaaaaaa agtgacgtgg tcttttttac 60 tatggggggg gggggccggcttgagggggg taggtgggtg cccaggggaa gtgggggggg 120 cgtgggagaa gaatgatgtgaccagagaaa gggcgtggaa cggaaagggg gggtgggtat 180 gagaagggcc aggggccagaggggctccct cagggctccg ctgtcgggag aagggcagca 240 gcctttggag gagagggagtctcagtggcc aagcccatat acctatgagg ccaccaatct 300 cagaattata agaaaattctcgtggagaaa aatctcttag cgcgtgggcc cactcagcgt 360 ggtgtgggtg ctcccagtttctcttctaca ctcagtgcgc ggtctccaga gtgagaccac 420 ccgaaggggg ctccttctgcgtgtgggtct cttgtgtgga tgaaaaaagg gcgagcctat 480 agattctgcg tgggatattctcaagtgaag agccttttgn gggcgtatac actcgagtgt 540 gtgctcaaat 
agcgcgtgtgtatcccggtg gtgtgtgaaa aagctgtggg tatatctccc 600 ggcgctctcc aaaacaaaatttctcccaca cacacaaaaa catatagccg gggagggccc 660 ccaaaggggg gaaggaaacaacaagcaagg ggcgagagaa aaaaaaggag agggagacaa 720 acacaagacc aaaccaaaagaacaacagac acaaaacagc aaaaaaaaaa aaggtggaga 780 caaacaacaa acaccaaccaaaccacacag caaaggcaag gacacacccg aaccagacaa 840 aacaatacga aacaacaacaccaccaactc acaacaaaaa accagcacga aacgaaacac 900 acaaacacaa acacaaaaaaaacaaacaga aacaaacaaa cacaaaacca gaacgacaaa 960 aaacaacaca acacaagggaaagacaaacg acaaaacaac acacgacaaa ccaggcccac 1020 agactaagta catatcgcaggccagcgcag acgatcaaac acaacacaca acacacacag 1080 caaaaacgag caccaccagaagacagacca ccacacacca accgaggcag aagccaacga 1140 acacagcaaa gcaacaacagtcacagcaga ctcacaaaga acagaagcat acagacacca 1200 acagacaaca accagccacgacacaacaga gcagcacaaa acacaccaag aaggaacaca 1260 cacatcggcc caaccacacgcaacaacaac acacaccaaa c 1301 26 694 DNA Homo sapien 26 tggttgcacgcagccaagca cttccatcct cattgctggt tagggttgca ctgtgtgaag 60 aatacagctgattgccccgg atgcagctta tgtgcttagt gatcacccgc cagatgcagc 120 tttgaaattttgtgatcacc aaccagatgc agctttgtgc ttagtgatca cctgaccaga 180 tgcagctttgtacttagtga tcaccaattt tcatctcggg ggctcttctc tgatctgtca 240 tcctctcttcatgttaacct gctctggctt tcacggtaca gactatccct ttataaacac 300 tgaaaaccggaaaacaacac aaaaaaaaaa aaaaacacaa acccttgggc gtcaacccgg 360 ggtccccacggtgttacccc ggtgtggtct aacattgtgt acccgcccca caaatttacc 420 cccaactcatttctcaaatc acaacacaag tactccactc accaaagact caacaatcta 480 aaaaacacataaatcacaat atcacagaag aggcgcctga gagaggaggc acggggagcg 540 gaggtcggagagagagcaag acgcgcgata cggagaagga ggagggccgg gtcggggcga 600 ggttaaggcacagaggaagg aggtgaacgg agaaggagtg gagacgcagc gctgagcagg 660 agagacgaaggggcgaggat aggaagggca ccgg 694 27 820 DNA Homo sapien 27 cgaggtctcccttttttttt tttttttttt ttttttccat ttttaaaaaa agtgacttgg 60 cttaattactatgggcgggg ggggcctgct taagggggta gggtccccag ggaagggggg 120 ggctggggaaataataacaa aagggcgtgg aaggaagggg ggttgtgggt ttgtgagggc 180 cggggcccagggggtcccct cagggtcctc cgctctcgtg ggaggggacc agcctttaag 240 
ggagggagtctcctgtgggc aagccattag tcttggggcc cccaatctca gattaaagga 300 atttttcttgagaaaatctc tagcgtgacc acttcacgtg tgggttgctc cagttctctc 360 tcactcagtggcggctcaga ggacaccgcg ggctccctca cgtggggtct catgtgggta 420 gatggcgcagcaaagatctc gtgatattcc atgagaagct gtggggggga tacactcagt 480 gtggccacataggcgtggtc cccgtggtgg tgacaatgtg gttatctccg gcctctcaca 540 attctccaccacaacattca ggccgcgaca caaaaacgag cacccaacgg gggggggtta 600 caagaacaaacagcggagca gacgagccgc acaacaaaca catcgaaaca gaaataacga 660 agacagacaccaacaacagg gacacccaga gaacgaagca agcacaaaaa ccgaacaaag 720 aagaagcaaggaaggcacaa ccaacatcga caaccacgaa caagacaaat gggacaaaag 780 aacacagcaaacaacagacg cccacacaca accacaccca 820 28 669 DNA Homo sapien misc_feature(480)..(480) a, c, g or t 28 cgaggtcccc cccctttttt tttttttttt tttttttttttttttgcccc ggggcagggg 60 ggggggtttt tttcccatgg gggggcccgg gaaattttttccccttttaa aaaaaataca 120 attttaggtg ttttggggcc ccccccaggg ggggtttttgcaaaagggga aaggtaagac 180 aacacaagat tccgtttggg gatggtgtgt gcggcatggttgccttcagc gtgccctccg 240 tggtccgtgg acgccccctc tacacctctt ctggggccgtgtcaacctct tgtggtggaa 300 ttttcctcac ctggtgttgt cgttggtgga ccctccatgtcggtgtgggg ggggcggctg 360 agatgccctc attggatgca gccattttcc acaatttctggtctaaaaag ggaccgtgtg 420 agaaatgttg accccctggt gtgaaaaaga agaagagagacagttaaatg aggaggagan 480 gggacaagac agctctcttt tccttttggg gacgcgggggggaatagctc taagggacca 540 ctccacctgt gtgggggtgt ccttccacaa gcggggggggaagaccgggg cgcaatagga 600 tggtccgtgg gtggtagaat ttgtatcccg gcgctcaaaattccccaaca aattccaaca 660 cacaaaatg 669 29 144 DNA Homo sapien 29cgcattatga ctatatagcc caatgggtca ttagatgcat ctcgagcggc gcagtgtgat 60ggatggcgag gtcaacttga tttctctctc tggttttctc tcttactgta tatttattta 120taaaactaat tttatcctga aaat 144 30 631 DNA Homo sapien 30 gcggccgcccgggcaggtcc cccccccttt tttttttttt tttttttttt ttttgtgttt 60 aaaaaaatgtggaattttgt ttttttactt attggggggg ggggcctgat aaggggtgta 120 gtgtgtgcgcccaggagaat ggttggggtc tgtgaaaata ataaaaaaaa tgtcttgaga 180 agagaaaggggtgtggtgtg ttagaaggcc ggtggcccaa gtgggtgctc cctcagtgtc 240 
tcctgctttcctgtgagaag ggaaacacgc ctttaatgag aaatgagatg ctactgtgca 300 acgccatatacgtataggtg ccaccaattc aatatttaaa aaattctctt gagaaaaatc 360 tcatagccttgacccaactc agctggggtg gtggtgctcc agtttctcct ctcactcagt 420 ggcggctcagattgaacccc cgatggtctc catctcgtcg tctctctgtg ggtgagaggc 480 acgcatagattcgtggatat tcacataatg aaagccttgg gggcggtaac actcgagtag 540 gcacaataggcgtgttctcc ctggtggtaa aaatatgttt tactccgtcc tcaacaattt 600 tccacacaaaatcaggagaa acaacaacta g 631 31 618 DNA Homo sapien 31 ggtcgcggcgaggtatccat ttgcctcaac ctcccaaagt gctcagatta taggagtgaa 60 ccactgtgcctggccaaaaa atatttttta agcagtgact taggtatcaa atataaaatg 120 aaaagtattttataaactgg actagaacat ttagtaaact tccttgtttt tattttttta 180 ttttttttgagacggtctcg ttctattaca tgggctggaa tacagtggga agatcacagc 240 tcagtgcagccttgaactcc tgggctcaag caatgttctc tcctcagctc ccaagtagct 300 gggcttgtaggcatgtgtca gcatgcctgg cttattttct ttttttcttt ttttcttttt 360 ttttttttttattttttttt ttttattttt tttttttatt aaaaagagca ggaggaggtc 420 atattatggtgtggcgccgg aggcggtggt ctctccaaac ctctggggtt ccagaggtag 480 tcttctccgccgagtgttgt gtcacaacgc gctgtcgggg gagcactcgt tggggcaaag 540 agtctgtcgcctggggtaga aatgtggttg tcgcgcgccc aaatttcgcc ccaaaaattg 600 cgagaacacacgagaatg 618 32 531 DNA Homo sapien misc_feature (258)..(258) a, c, g ort 32 ggagactgac tcatataggc caaggtccct aatcatgccg agcggcgcca ggtgatggat60 gcgtggcgcg gcgaggtgtt agcggctctg ctcctctgat tatgccttat tctttgctta 120tttcctttac tgagaaatgc ataatttata gttgcaaata aaaaattaat gcaggagatg 180tgttccccac atgtactttc ttattcacat ttatgccaaa aagagattat gttatcatat 240tgggactacg ttttatanag tcttgtcctg agtttactag tccaagctat attataagaa 300gactttagtt ctcctataac atggatcaga tatttcccaa aagatattta atgcataacg 360caaaaaaaac aaaaaaaaaa aaaaagcggg ggggaaaacc ggcgcaagag cgtgcccggg 420gggaaactgg ggtccccggg ccaaatttcc ccaaaaaatt cgcgacacaa aagtgagaaa 480aaagagcaac acacgccagc caccaaagcc accacacaac aacactaaca c 531 33 841 DNAHomo sapien 33 ggtcgcggcc gaggtccccc cccctttttt tttttttttt tttttttttttttttttttt 60 tttttttttt tttttttttt tttttttttt 
tggggggggg gggggggttttttttttttg 120 gggggtgggg gggggggggg gttttttttt ggggggcccg ggggcgcccaaccaccgggg 180 ggaaacaaaa aaatcatgcg cgcgccgacc cagccaccaa aagaagggaagaacaagacc 240 gaaagtgaca acaccacgcc gagacgagga aagatgagga gtgatgaaagaaagaagaag 300 gggacggcga cagaagcgag acgagcggag gaggggagga cgacaaagacccgagacacg 360 acgccacgac gaacagaccg ccgaacaaca atggagaaac acaacacagaagagaggagg 420 agcgctgata agcagatgcg atgccacaac agccgctcgc cgcccgcggaatctaatgcg 480 aggaggcaag actgaaaaag aagaagagtc accacaccac ccaccactcacaccgaacag 540 atagaaaaga cagagagaga gtcgacagag agagagagac agaaatgaggtgaggcgtcc 600 agcgcccgtg cgcggtgaga gccacaagca gagatctaca atcaatgcaagaaccattga 660 aggcggagcg cgatacaagc aggcgagcca atacgtgact catccgcggtgggtgtaagt 720 ctgagtgtcc tcgtcaaacc acgaacacca ccgccacaag atgatgaaaacgaacagtag 780 cataaacaag agacaaacca agaagaggca agcaagcaca gaagagaagcgcacgcgaac 840 c 841 34 417 DNA Homo sapien 34 ggtcgcggcg aggtacaagctttttttttt tttttttttt attttttttg gggtttggag 60 ttttttttca attttttttttccaaaatag tgacttttga aaaattttaa catcccctgt 120 tttgaatttc ccacttttcaaattgaggct ttcaccacta tattgattgg gatattaata 180 ccaacgacca tagtttttgggcatcttgac tttttcctct caaattaacc atcaacgtcc 240 tctcactgtg aatttcacgaaacgacctca ttacctcttt ttaatttttt cccgtggaac 300 tttacaaaca agcaacaacgcttgtggtga tactctcagt tgctcaatac catgtttcca 360 tgttgtaaaa ttggttactccgccactcac aattcccacc aaacaattag cgacaat 417 35 1746 DNA Homo sapien 35gcggccgccc gggcaggttt tttttttttt tttttttttt tttttggggt ttccttattt 60tcacaaagtt ttttgtgtgt gttgggccta aaagggaaca ggcgaggggg ggtgtattcc 120aattttctct ctctcttttc tcaggttggg aaactctcgt gggtgcccgg gcggaattct 180cttataagaa atatcccttt tccccccaga gattataaac caggtaagcg cattatatat 240acccacattt tctcttataa tatagagtat agtgggctcc tatcacaata tataaaacca 300caccttttca cacaaatagc gcttttagag tgtggcattc tcatctcaca cagagtatat 360atctctcgcg cacatatata tatttttata tatactctcg tggtggtgtc ccctattgtg 420tgcgttataa gaacattata acgcgcacaa gcgatatata tttatatatc tctctctcgc 480gtgtgtggca catatatgtg tggggcagat atctctctct 
ctcttctctc gctgtgcgca 540catatatcta cggcggggga tatatatata tctctcacgc gcgcgagggg aggagacata 600tctccgcgcg cgccttttaa tattgtgtgt gtgagacaaa gtgtggattc tctccccatt 660atatatatat actactcccg ctgctcagac acgtgtagac acagcagtag tgtgagggga 720gagacccccc ccgtgtgaga ggtgttctcc cccccacact atatgtctca gagatatatt 780tccacttttt ctcacttttc actatctaca aaagagagcc cccgggtgat atatcttcta 840tcgcgcgcgc catatatctt aatatatatg atgagagagg atactgcgcg tggggtctcc 900ccaaggtgtg tagaaccccc caagtagtgg tgggggcccc ccctaaaaaa agaggtgtcc 960cctattatat aaaccacaaa aaagcgggcg gtgggggggg aataaacacc ccgggtgggg 1020gcaccaaaaa gcgcggtgat taaccccgcg tgggggtgtg gaaacatcat gtggggcgtg 1080tctccccgcg ggcgcctacc acaaactttc cccccccaca aaatctgagt gtcaccgcgc 1140agccacagca acacacaacc acgtgtagga aacaagacac gagacacaac agcgaacgag 1200aagaagagag aaaagcaaac cgaagagaga tagaggaaac gcagaagaca gagacgactg 1260atgaagagac gcaaacgaca acaaacaaca aacacgaagg acaacaaaca caacacacac 1320acacaaatac cagagacgaa cgaaaaaaaa ccacgagaga caagcacgac caagacaaga 1380aacaagagaa acgaccacag agacacacac agcgaactag acaaaagcca aacaacaagc 1440gaaggaagaa gactaagagc acgaccgaga acgcacagaa caaacgagaa acaaaaaggt 1500aactcaccaa caagacaccc agcagacacg agagagagaa gacaaacgac agagcaaaca 1560acaacgaaca aaaagaccga gaagaacaaa atcggacaaa cacaacacaa gcagataaca 1620ccaaaaacga ccatacaaaa tcccacaaca aaaaactacc acaaccaaca accaacaaca 1680cacacaggat caagccacaa acaacacaga acacacacaa acaaagaata cgaagagaac 1740aaacgc 1746 36 740 DNA Homo sapien 36 cggccgccgg gcaggtagag acagtctctctctcttgcct agctgggagt gcagtggagt 60 gatcatagct cactgaggct tgaactcctgggctcgagca atccacctca gcctccagag 120 taggggagac tacagatgtg tgccaccatactcagctaat ttttaaactt tcgtagagac 180 agggtctccc tgtgttgccc aggctggcctcgaactcctg acctcaaaaa atcttcctgc 240 cctggcctcc caaagcactg ggattataggtgtgagccat tgcgcctggt cataaattct 300 tgttttagtt tgttggttta ttagacgatggaatctctct ctcttgacca ggctagaggg 360 ctgtggtgca gatctcagcc cactgcaacctctatctcct gagctcaagc gatcctcctt 420 agcttcccaa atagctggaa ctacaggcatgtgccatcac gtccagctaa 
ttttgtatct 480 ttagtagaga aggttttacc atgttggacagggtggtctc gaactcctgg ctacagtggt 540 ccacctagct cagcctacca tgagtgctgtgattacagtg cgtgagccac catgcccagc 600 ctctaaagtc tgtttgctat tcaaagtaaatatgacatgt gtttgagtca cacaaggaaa 660 gcactaaaaa agacggtggg gggaccgggcaaagctggcc ccgggggaca tgtcccccgc 720 ccaatcccaa tgaaaagaac 740 37 687DNA Homo sapien misc_feature (499)..(499) a, c, g or t 37 gcgtggtcgcggccgaggct acctaagcaa tcaagctggc cagctggtgc accatgggag 60 agatgatcaccaaacttttc ttctctttga ggtcacacac ctagattacc tgccccagtc 120 tcccttgcagttagatctgg ctgtgaggtt gagttttagc cagtgggata acagatggaa 180 gtttccactggcctaaccca taaattcctc cacaactctt cccactttta atcttatgcc 240 cccatgtcgtctcttctccc agccttcttc gtctcaataa atgtcactag cacatatcca 300 gtcattcaaggaaaaacaca atggagaaaa ccatcctcaa ctacccattc cctttacctc 360 actctttcccagcatcctgc aaaatctcgc tccaaatata gctccagttt gtccacttcc 420 ctcccttttctccagtctat aaccttggtt tactccatca ctatctctca attagactat 480 tgaaataaaatcctacctng gaatctcaaa aaaaaaacaa aaacaaaaaa aaaaaaagct 540 ctcgggggtcaacccatggg gcaaacgcgt gttccccggg gggacaatgt gtttcccggc 600 ccacattccccacattggcg caagcacacg ccgcgacgcg gccggacggc cgcgcccacc 660 cacgaacgcccaccgcggac agcgaca 687 38 148 DNA Homo sapien 38 gaggtatcga attatgcgatgggcctctag atcatctcga gcggcgcagt gtgatggata 60 gcgtggtcgc ggcgaggtacaggaactggc agccgcactg gctgccagaa acgtcagtgg 120 tgctgcccat tcggcgaaaggttaggga 148 39 815 DNA Homo sapien 39 cgcccgggca aggtccctcc tccttttttttttttttttt tttttttttt tttttttttt 60 tttttttttt tttttttttt ttttccccctttttgttttt tttttttcca aaaaaaaagt 120 ccaaaaattc cccccccccc cccttttaaacccccgtggt ggtgtcgccc tcccttgtgg 180 gaacgaaaca aaagcgggtg gtggtcgccgctgatgatga cgtcaaccac ctagcacaaa 240 aaaaacggtg gtggtgattc tgtggggcgccccccctcgt agacatatca tcatcttata 300 taattagtta gtggtgtggc gccggagggcaggggcacac actcatcaat atctttttta 360 taatcattat tatggggggg aagaaaaaaatcatgttatc accccccagc ggtgtggtat 420 ccaacaacac acaaaagaag agacagtgagtaaaacaaca aatgagtgag tgagaagaca 480 acggcaggcg tgtggtgaca gaaacaatgactgtatgcag tcgctagtct 
ggagcgaacg 540 tgcgtgttat gtcatcctcc gcccggaatagataaaaaga tgggggtggc tacacacata 600 caggaggacg acggaggaga agagaagatactacatcaaa caaaatgggg ctgacgctat 660 tattatattc gatcggggag aagaactatatcccgacaga gaagacggag ggagaagcaa 720 taacaacgac gaaacaaagc gtcacaccgcggagagaaga aatgggcttc ccccgccaca 780 ccccccacaa ccatctccaa caaccacaaccaagt 815 40 138 DNA Homo sapien 40 gccagtatat gcataaggat ggtgaacaggaacatttagg agcatttgat cttatgaact 60 ggtggaccgc gagcccttag ctagacaatgagaggagaat gtacaccatg taattatatc 120 tgcttgccca cgaaacaa 138 41 79 DNAHomo sapien 41 tgaagataga tcatataggg cgcatgggtc actagatgca tgtcgagcggcgcaggtgag 60 gatagcggcg ccgggcggt 79 42 887 DNA Homo sapien 42atgctggtag tgtttgtgtt atatggtgca gcgtccagaa gtatgtgcca agctgcatta 60atttgaatcg gccaactgcg ctatgttaga agggatgcgt ttgacgtagt atgggtgcgc 120tcttcccgct tcctcgctac atattgactc gcttgcgctc ggtcgttctg gcctgcgggc 180gagtagagaa tcagggctca ctcaaaatgt gcgggttata tacggtttat ccacagaatt 240caggcgataa cgcaggtgaa aataaccatg ttgagacaaa aaagtgccat gctaataaag 300gccaggaacc cggtaagaaa gggtcgaggt ttgtatgcga cgtaatattc catatggcat 360ccagccccca ttgagtgagt catttaacaa tcaatttcgg ccgctcaaag tcagaaggtg 420gggaaatcct gactaggaac ttataaagga ataccaaagg gcggtttccc ccacatggaa 480gcatcccatc gtgcgcaatc tccatgtacc cgaccctgcc gactttaccg gattacccat 540gtccgtgcct atctacgctt agggaaatgg tgtggcagca tatcttcatt agctcatagg 600ctggaagcgt aatcataagg tgacggggta agagtacggt agcgattcaa tagcttgtgc 660atgctgttca acagagaccc ccccggttca gcccaactgc tgccgcctta ttccggtaag 720tatataagtc atgaagttca gacccggata aagacacgac taaatggaca gtgaaagaga 780gccactggtt acgcaggtta agagcaggag gaatttaggg agggaaacga gaactgtaag 840tgttggctaa ctatcgggat agactaaaag accgtattga gattagc 887 43 425 DNA Homosapien 43 aatgtgttgc acagtgagga cgagtttccg tgtcaatgta gctgtgacaaaggtatcaga 60 gacgatcagg ggtatgagaa acccacgtgg atcatagcaa agtattacttggcagcaaat 120 agtgtacctg aaatagacgt gaattgaagg agaatgaaga aatagaaccatgtaacatca 180 ataaagacaa aggaaataac acacacattg accaacaaaa aaaaggcaaagaaattagaa 240 gaatttacat 
tggaatagaa acagggtaca tatgacatca aacacccaaaggctaagagt 300 tgcaaggacg agaccttata agaaagactt gaaggtcact tcaactgattcacataagat 360 agtaacactg tgtaaaaaat aggatatcca gtcaacaaat accaaacaaaaaatacaaaa 420 gagaa 425 44 406 DNA Homo sapien 44 caggagaatc acttgaacctgggaggtgga ggttgcggtg agctgagatc acaccactgt 60 attccagcct gggtgactgagactctaact aaaaaaaaaa aaaaaaaaaa aattgattgg 120 ctgtgcctca ttacaaatgcttttgatgtt ggagtgctgt tgttggaaat tatttttctt 180 ttcggggtct tcaaaatttcaagaaaagtt ggatgattgg actttggaag attacaaaaa 240 aaaaaaaaaa aaaaaaaaaaacgcttgggg ggtacttcct gggtgctata ggtgtgtgtt 300 cccgtggggt ggaattgtggttcctccggt ctcaacaatt ctccccccac aaacattagc 360 agacgcaaac gtgggagggagaagaggtga ggagaaagag gacata 406 45 1267 DNA Homo sapien misc_feature(358)..(358) a, c, g or t 45 cgtggtcgcg gccgaggttt tttttttttt tttttttttgggggtaaatt ttttcttttt 60 taaatgggtt attcccataa ataaaatctc ttttccacttgaatatatta aaattataaa 120 cactcatttt acaaatttat tcccaggtat ttacatttctcccctctccc tctccccaaa 180 aacgcataca ttttggatta aatataacaa cattctcaggctcttataaa accacctgat 240 ttctcgtggt gtgtgcacgt ttagagaggt gtgcgaagattggctgtcgc ctctctctca 300 cacagagaca cactctctca gtgtggtgtg tgtgtcctccccccttctca ggagagangg 360 ggagtgtgga attgtcgccc ctctcccaca ttatacacttttgtgtgccg tcaaagggag 420 cgcgagaata taaagcgcgt ggggggcggt ataaatcttcgtggtggtgc tcatatangc 480 gcgtgtgttt ctcgctgtgt gtgtgtgcaa caatgtgtgtgtatatctcg ccgggctcta 540 cacacaaatt ttctcacaca ccacacacac acattattctcgggcgcgcg acacaaaacg 600 caaaaaaaaa gaagaaaaaa aaaaaaaaaa aaaaaaaaaaaaaaatgaaa aagaaaaaaa 660 aaaaaaaaaa aaaagaaaaa ataaaagaaa atcaaagacaaacagaaaaa acataaaaaa 720 agaaaaagca caaaaagaaa aaaaaaaaaa taaaagaggaaaaacaaaca gaaaagacaa 780 aaaaacaaaa aagaacaaaa aaaaacagac aaagaaaaaaaaaaaaagaa aaaaaaaaac 840 aaaaaagaaa aaaaacagga acaataaaaa aaaagaaaaacacaaaacaa cagaacaaca 900 gaagaaaaaa aaaagaagag agagagaaaa aaacaaaaagaaaaaaaaaa aaaacaaaaa 960 agaacaaaaa aaaagaaaaa aaacaaaaaa caaaaaacaagaaaaaaaaa gaaaaaaaaa 1020 caaaaaagca aaaaacaaag aaagagaaga ggaaaaaataaagagcaaaa 
aaacaaaaaa 1080 aaaaagaaaa atgacaaaaa acacgaaaaa acaagatacaacaaaacaag aaaaagaaac 1140 aaaaagaaaa aagaagaaac acaaagaaaa acaaaaaaaaacagagaaga aagaaaaaaa 1200 gaaaaaaaga aacaaacaaa agaaaacaga agaaacagacgaaaaaaaaa cacaagaaga 1260 caaaaac 1267 46 239 DNA Homo sapien 46acgcagcaat acgagcatga catatggggc tcacgtaata tgtgcggtgc gtccggattc 60tttcctgcag atagatttgc ctctgtgtct tgggcgaact ccagggtgag tcgattgagt 120agcccaaacg gtatccttac cagataaata tgcatatgat cttcgaagtt attgaccgca 180atatcaacgt gaggactgta taatacacat tcatgaaaga tggaccttga aaacgcggg 239 47234 DNA Homo sapien misc_feature (190)..(190) a, c, g or t 47 cggccgcccgggcaggtttt tttttttttt tttttttttt ttgtggtgaa gtggtaaatt 60 ttttttataaaaaaggttgt gttttcccac agtattaaag cggggggtat tcctagtggg 120 ccataggcgtgttcccggtg tgtggaaatg tgtgtatccc gctcacattt cccacaaact 180 tacgagaagnatgagagtag actaagggga aatgcgagaa gatgcatacc tagg 234 48 964 DNA Homosapien misc_feature (364)..(364) a, c, g or t 48 gctttttttt tttttttttttttttttttt tttttttttt tttaaatttt ttttttccaa 60 aatattggcc ttttgaaaaaatttaacaat acccgtggtt gtgtgaatcc cccactattc 120 tcaaatgtgg ggctttacacccagataagt gtggtggggt ataaaacaca gaacgctggg 180 tgtttggcgc aattgtgcacttttatctct ctcaaagtga ccatacacgt gcccaagtga 240 attctccaga agagaacctcatatcacctc tttataattt ttctcccgcg gagaaattat 300 aaaaagagaa aagagtctttggggcgtaaa cactcgctgt ggtctccaat agctgtgtgt 360 cccncgtgtg tgtgtgacaatgtgtgtgta tctctcgcgg ctctccacaa attttccacc 420 acacaaacat tttcgggtgacagcaaaaag ggtgtcaaga gcgaggagag gcaaaaaaag 480 gaagggaggc agaaccgagagagaggcggg gagtaagcag acgacaagac agtaaaagtg 540 aggaagacaa gaacaaagcaagtggcgaag cgagcaaaag ctaggagtag gagcagcgta 600 ctgaagatgc cattcgaaggataagtactg cgtgtagaag aggatgcaag cacggacaaa 660 gaacatagat aggaggctgaataactgcac gcaacgacca gccagacatt aggatgctac 720 tggtgtagat ggagacgggaggacagagaa tgcggtgagg gcggtcgcac gaaaaccagc 780 aacagagggg gtagcgcgcacagacagcag agaagacaga acgtaagcag tacgtgagca 840 caaaagcagg gtaaacagccccaccgagcg aggagagcaa aaaagctata ctcgaacaaa 900 acaaaaaaaa 
acaaaaaaccaaaaccaaga aaaaacagaa aaaaaagaaa acacccacaa 960 gaca 964 49 957 DNA Homosapien 49 cggtcgccgg gcaggtacgt gtttaatttg agtattgatc aaaaagcgtttattattaat 60 tctagaatca gtcaaaatga tgttctgaat agaaaataag atattcggtagtagctgtac 120 taaggcatag actcttattc aaatgagaag taactttgct aaacaccaagccttaatcgg 180 cattttataa taagaacatc aataccaata tttaaaataa ctgtatagccagatatgcta 240 gcactcgaaa attttacgaa ctaaaagtcg aacatagaag aaattgcatatccatgtctg 300 cataccccta aggatgcctt ttggtgtctg atattttttg aaaatgagagtggtcccaga 360 aatggttcat gttgtacaag taatttgtct ccttatgttt gtttccttatttatacacgg 420 ggtggactgg agagaaggga caaagtcaat ctgtctgtac atccgcaccagtgtggtacg 480 gtgcatcttc catgttacct ccctcttgga agatcagaca ccatatgttttacaatacgc 540 gttgcccatg gcagtattgc ggcgaaagtt gcgtttgttt tgtttcaataggggctggtg 600 tacatggttg tctaaatata gtgtgaagtc ttcaatttct gaaggaaactaaagagacga 660 catatgtgtc ccctaagggg tctactaagt ccccatattc tctcttttggggctttaaca 720 gtggctagcg ggtcgagaat tcgcaagaac ttcccacgtc acgtagcttcattggtggtt 780 gtggctacct atccgatgag ttctttgtca ctttaggttt tgttccgtccagggccgctt 840 agtagctaat ttagtcttcc taaattcctt ccccctgtcc ccccaaaaacttgtggtgtg 900 ggttttctcc ggggaatctt gggtcctcgt gtgggggaaa tggtccccgtcgagcca 957 50 108 DNA Homo sapien 50 atggtgcagg tgccggaggg tgggagaatgaagtgatgat atgagcgtcc tgtctgtggc 60 ggagcttagc gtctcatggc atagctgtgcctgtgtgaag ttgtgatc 108 51 124 DNA Homo sapien 51 atggttgggg aggcgcataagagagtgtct atactgaggt aaagaaatag ttacgaaaat 60 taacaacgga agtagtcattctcaatctcc taaaaggtgg gagtaggatg caaagaaaag 120 aaag 124 52 598 DNA Homosapien misc_feature (469)..(469) a, c, g or t 52 gtcgcggccg aggtcccccctttgattatt tttttgcttt ttttgttttt tcttcatgat 60 ttgaaagacc tcgcctagattgttttcgtg gttattgctt ggagggagca acacaaataa 120 aaagttgaga ggcccatggtgtaatactgg gggaaaatgt ggggacgagt ccaaacaaca 180 tgtgtaccgc ttttttccggggagaaagaa actagtagca ccttgtatcc cgtcggggaa 240 cagaaatccc ctcatttaggcgcgtctgcc ctgattgccc gcaagattag tatcggttat 300 tcaagagggc acccagattatatactacgg gaaggcgcgg tggggaggca caggtgacac 360 
tggaaaggcg ctctcgctcgtggttggagc catcgtgtcc accgctggcc tccacccttc 420 tccacacgca aatcttgggcagggaaaatt cctggctgtg gtctataana taacactttc 480 ttaagcatgc cacaaaaaacaaaaaaaaaa caaaacaagg tctgggggaa cccctggcgc 540 aaagggtccc ggggtaacatgttgtaatcc ccgggccaca aaattccccc acaaatat 598 53 481 DNA Homo sapien 53gagcgagagg gcggagaggg gagatactat atgggcaatg gtgcttagat gctgctcgac 60ggcgcgggtg atggatagtc gcggcgaggt acattttaaa ctagattgct agcctatgta 120tttgacatta tcattttcag tgatgtataa ctgtcacttt ttaattttat atattatgta 180tttatttgat attagattta ataactatat aaattttatt cattctttat ttgaatagaa 240ataaaagttt taagagaggt tataaatcac tttattcaag tatttagtat atgataatcc 300agttaactct gcgtagacat agatctgttt accctatcat tttcttataa taaattcttt 360gaaattaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaacct tgggttattt cttggacaaa 420tttttccttg tttaaaattt tttaattcgc ccaaatttcc cacaaaaatt gcaaaagggg 480 t481 54 878 DNA Homo sapien 54 tggtcgcggc cgaggtctta ttttttttttttttattttt tttttatatt aaagccaaaa 60 gttattggtg ggggaaacct ttttgggccattcagggatt tcccctttgg ggaagggaac 120 ccggcgtgcg atgtggtggt aggaatcccccgtggggtga aaacgttcgt gtcaccgtgg 180 tgcactaaaa gcagaggcac taacggggcagcggtgacag tgagagggtg gcccactcat 240 atagacgcag cccccacagg tgctcccacagaaaatgtag ccgaggtacg tgggctccgc 300 agaagcagtg ctatttcaaa acatatgtgtggtcccccct ggtttatgaa aatactgctt 360 acgaactatt tatagtgtag tgaataccaaaacgaaacgg tgattttgtg tggtgtgtta 420 cacaaccacg gtgccgtgtg ttgtggtctgcgtccgagtg gtcgcccgtg tgtgtgggcc 480 gaggaaggag acagactggg gcgttcgctcctacacgccg tgtggttttg gggtggctcg 540 ccccttctgt ggcctccgac gctcaggcgtattccaggcg cgacagaaaa cccacttgtg 600 tgcgagaaat ggtagtgcca accaagttaaactgctgtgg gtgtgcgatc aacctgtgtg 660 ggggccaatg acgcgggtgg tctccggtggtgggtaagaa atttgggttt attctcctcg 720 cttccactaa atgtctccgc aacaaacaattttgagagtg ataccagaac aaaaaagtac 780 aactacccaa ttaactttaa ttctaagtctaaccaaaagt attaccttat agaactacag 840 tcactatact tctataccta tagcgtacaagcaaatat 878 55 278 DNA Homo sapien 55 caacacactg atatcgtcta tggccatgttctctagatgc tgctcagccc gctgtgatga 60 tataaatgta 
gcttgggagg agggaatgtatactggatat tgtaatgatt taatttatat 120 tcagtgaaaa gatttattta tggaattaccatttaataaa gaaatattac ctaaacaaaa 180 aaaaaaaaaa aaaaaaaaaa aaggctggggttcttggcct gctgttccgg tgttgaattg 240 gttttccggc ccaaaattcc caaaaattcgagaacagc 278 56 123 DNA Homo sapien 56 aaacaaaaca aaaacacgaa aagacaacacaatcttgatg ttagtcacta tatggcaatt 60 gtgcctctag atcatgcttc gagcggcgccagttgtgatg gattggtcgc ggcgaggtac 120 aat 123 57 576 DNA Homo sapien 57tccacgacaa gctatacgag catcggtgca tcatggagca atgagagaga ctgttccacg 60catgttgtac acgctgtttc ttgattcaca ggtagagcct tgctaatagg agatgacaga 120gagagaggct cgcgtcggag ttccaagacg atggtgcaag gtcgtcgttc gttgtcattt 180gatactcctg gtttagccgc tattgcttca tcctcacatc ctatggcgta tgtcgtcatg 240gggtattagt aagtctcttt ttgatcctag tgacaagtct tcatggcctg taacactgag 300attacttggg atcgatggtt caattcccga gagtattgag gtggacaggg gttgttaccg 360tcgagtcctg gaagatccat cacgtagagc tcgaaaatgt ctctattaca taacgttgga 420ctgaaccccg atataaacat cagtattggc attcccggaa cgcatcggtg atacccatat 480ggcttttgtg tccgttaaat tctattgggt tcattaagca ttgttttacc gttgtggtga 540acaagttgtg gttattccgg agtcaagcaa attcca 576 58 1043 DNA Homo sapienmisc_feature (437)..(437) a, c, g or t 58 cgtgctcgcg cgcgagtgtagaagtcgtat gtaaaacaga gcaggcacgt aggtcgagct 60 acgcgcagga agtacatagatacattgaca cccacgtatg gacattcccc aacgtatagc 120 agttcctcta gatacttcatgttgtgcgag acatgtgcca taacgaattg gtgtcgtccc 180 ccaatacact tatgagcgtataggacagta tagattggat gggacgtgag aacagagaag 240 tgaaggttat accgtatgacatggtgacat agcatggcag atattgtaga ggtccaacgt 300 actcacatct gccattagtagacggtgtag cacgtgtaag cgatggcatc aaggctagac 360 atgtaaatag tataggttccattatgtctt gtcttgtctt gctcggtgta taattctatt 420 gcttactccg tcgtctntccattttgacta catgacctat ataggggatt acacccaagt 480 ttaanngtta agaaggnntgtgtaagttgc aagtggnttg ggaactgaca aactttgact 540 ccaaantatt aacanncctgtgttccactt ctccattttt caaantgtgc gngnnctgga 600 natacnttcc caaagaacacagggttacgg cantgaacga aaaacaaaca gaaaatctca 660 aattcagaaa acctctttcagggggggttt ggggggtccg tggatagtgg gcagaacaag 720 
aattacaagt tccacaccagggagaattgg gagccaattt tcacaattag agggttaagt 780 ggggctgacc gaggctatttacttggccca gcatgtgggc acagaattgg agccaacagg 840 ctggaacaga gtttcggggatttatataag caccttagaa gtctctggat gtcagggcaa 900 tcttgggtaa gctcaaacattacgctaaaa cttccagggg gaaaattctt ccaggtagcc 960 taagctaggg gtaatccattggccataagc tggtcctggg gtgaacttgg ttatccgctc 1020 ccaatccccc attaaaaacaaag 1043 59 703 DNA Homo sapien misc_feature (407)..(407) a, c, g or t59 gctttttttt tttttttttt tttttttttt tcctttttta aaaaattgac ttggcttttt 60tactttgggc ggggggggcc ggcttgaggg ggtagggtgc ccgggggatg ggggggctgg 120tggaaataat gacaaaaatg tgttgaaagg aagggggtgg gtttggaagg ccgggcccgg 180ggggcccccg ggcccgtttc gggaagggga cacgccttag aggaaggaga ttcttgtgca 240acgccatatg catggcgccc ccaatcatga ttaagaaatt tcctggaaaa catctacgtc 300tggaccatca ctgggtgggg ttgcccatgt tctcttctat cttgagcggt gccagggacc 360cccagggggt ccctctctgc tgtctcttgt ggtgagaagg gagcgancgg caatgtactc 420tggtgatgac catgagaagc gtctgggggg tgtcagntcc agtggtgcac ataaccagct 480ggtgcccctg gtgagtgaga aaatgtgtgt ttacgccacg ctccaacaaa tacccaccaa 540caaaatatca tggaggagac gacaacctgg ggccacacgc gtcggagtca ggcgagaaaa 600atcaaggcca cacaatgaaa cacaaaacga cagagaaata aaaacacaac gggaacaccc 660acaaaaaaaa acaacaacaa aaacaagcaa gaaggagtgg cga 703 60 2110 DNA Homosapien 60 aaaacaaaaa aaaaaagact gatcagacct catatggcgc aattgtttcctctcaatgca 60 tgctcgagcg gcgccagtgt gatggatggc actggtggaa taaaacaacctgcatatttt 120 actttgtttg cagatagtct tgccgcatct tgtgcaagtt tgcagcagcataggttggca 180 tgctcagcac acacacaaca caaaacaaaa aagccctatt acaaggttggttgccctgtg 240 gtgtatgtgc cctggtgtag catgtgcatg cgagcatatc ctgcacacattgcacacatg 300 ttatgtatca gacacgagac actcgatcac cgtgggtgta acaaacgcaatgtgctaatt 360 tgcaccatac aggcctttgt ggcgtgtcaa ctcagagtgt ggtcctataagcttgattcc 420 ccttgtgttt gtggacaatt tgagttacat cacgagcctc cacaaattatccccacagac 480 aagacaaata actgcatgca cagttgtggc aactaacggc tacgacagcgacataccaac 540 cagtagagcg aaacgcgaca agaagagcga gacgaggcac gaagaggctacacatgcaag 600 gcacacacag accgccccac aacacccctc 
accaacagac agacagcggagcacacagat 660 ccacaggacg ccgcaacaag acacgaacac aaacgacaga cacagacaaaagcaagagaa 720 gcagacgaga agacagcagc gcaaagcaac agagacggag agacaagagacggaacgaca 780 ggaggaagag aggagcagcg aggaaagaca gagaagaaag agaaggcggaaggaaggagc 840 gacgacgaag ggggaggacg gaagagatgc gcgcggtcag atgagaggaggacagaaaga 900 ggaggatgag ggagagagaa gacagaagaa cgaagagacg agagaccacccccccgcgga 960 cgagacaacg aagcgggagg agcgcgaggc ggagaacgcg aagagtcggcagacggagag 1020 ggacgagaag aaggtgacga tgcgagagcg ggtcgcgagg cggacgaggggagagagaga 1080 gcgaagacga acgagcgcga ccgagcgaga ggaggaggag aagagagaaggaagaagacg 1140 acgacaatgg cggccgaagc aacgaccgaa cgaagaagga gagagagcgaagagacgaga 1200 cggagagaac gagcgaggcg gaggaacgaa agaagaaaac gaggagcggaggcagagcga 1260 ggagaccgaa cggcgagaga agagagcgag gcacccaacg gagagagaaacaacgaacga 1320 gagacagacg agacaagaac tcagaggcga agacgcacga cgcacagacagagaagagag 1380 aagacaagca gagaagcgca ccacggccag tcagcggagg cacagccccaagaaacaacg 1440 acgggaccgc gagaacagag gagacaaatg agagcggagg ccacacgaacgacagtgaag 1500 gaccaggacg agaccagcag caaggagaaa cgacgcatga gaacacacaacatcaaaatc 1560 agacagacac gcagcggcac gcacgacgca cgacagcgag aagagagacacacgacgaac 1620 aagcatgcaa ggagcagagg acagcacgaa cgcaagcaac cagagcaaaagcaagagagc 1680 gcagggaaga gaaggggcga cagcagcaac agacgagcga cacagaggagaaagagacta 1740 gaaaaagaga agacacagaa gacgcgacac ggaagacact aaagacacgcgagagaagag 1800 aggaggagcc catgtcgaag ggagaccgaa gacgacgaaa gagaaacgaaacccgacgaa 1860 cgcgagcgag gcgcagaacc acaacagagg atcaaagaca gagggagcaaagagcggccc 1920 aagagcagag ccgacacgca gagaccggca gcagcaagac acacgacgcgacgacgagga 1980 cgaggacgag agcagaagaa ccgcgacgaa cagcccacag agatcacagagggccgacac 2040 agcgaaggag gagagaaagg acacacgaga gaagcgacag gcactgacgacggtacgggc 2100 ccgcgaagac 2110 61 3413 DNA Homo sapien 61 tttgggaaagagggtcccca actcgagtgc cgcgtccccg gttatctcgt gaaatgtcgg 60 cacatgttatgtggcccacc atgtggaaca taagccgcca ggttgggaca aataagcccc 120 catcggccagaataagctgg attcaaatgc cggccattca ggttgaaccg agggtaggct 180 ccaaaaatatttcttcttct agctaatggg 
ctttggcaca acacacgata accaatggct 240 ggctgctgaaatatcagccc tttggggtgg ctggaaggta agtctagctt tgggaacact 300 agacatatataatcgatatt tacttatatt gcattatata cagtgaagtc ccatacacct 360 aggacatacccgcaagcaag ctttttcatt cctgctttac cggtatgatc tcgtctaaac 420 aaacatttcatttcagaaaa tctgcatcaa ttttcacggg ccattcacag tgcacaaact 480 gaaaagggcttttttttttt tttttctagc tccaccatct ctgcaacttg ccaagatgcg 540 gcaagactatctgcaacaaa gtaaaatata caggtttttt attccaccag tgcctcagat 600 agataggaaaaagatatgat tacggtttaa atccatacat agcagcttac aatacttaag 660 atgatgaacacatggcagtc aagacaggta atttttcctc acaacagtgc atggctaaaa 720 ataaagatctaacaacgatc tgtgaaactg cactgcaacg tcaaggttcg ttcttccctg 780 accctcccccgtataatcaa atgaatatcc cctttaaaga tgaactccta ctaattattt 840 tgggcgttttcattcagctt tgcgcttcaa tccagggatt tttgcttgga ttttagccat 900 agcatctttaacattcttat ttgcaagtcc tagataatga tctacctatg ttggtgcctt 960 gtttaatggtctgacactac tgattttggc tctcatttca ctcttcagtg ttcctgttat 1020 ttatgaacggcatcaggcac agatagatca ttatctagga cttgcaaata agaatgttaa 1080 agatgctatggctaaaatcc aagcaaaaat ccctggattg aagcgcaaag ctgaatgaaa 1140 acgcccaaaataattagtag gagttcatct ttaaagggga tattcatttg attatacggg 1200 ggagggtcagggaagaacga accttgacgt tgcagtgcag tttcacagat cgttgttaga 1260 tctttatttttagccatgca ctgttgtgag gaaaaattac ctgtcttgac tgccatgtgt 1320 tcatcatcttaagtattgta agctgctatg tatggattta aaccgtaatc atatcttttt 1380 cctatctatctgaggcactg gtggaataaa aaacctgtat attttacttt gttgcagata 1440 gtcttgccgcatcttggcaa gtttgcagca gcataggttg gcatgctcag cacacacaca 1500 acacaaaacaaaaaagccct attacaaggt tggttgccct gtggtgtatg tgccctggtg 1560 tagcatgtgcatgcgagcat atcctgcaca cattgcacac atgttatgta tcagacacga 1620 gacactcgatcaccgtgggt gtaacaaacg caatgtgcta atttgcacca tacaggcctt 1680 tgtggcgtgtcaactcagag tgtggtccta taagcttgat tccccttgtg tttgtggaca 1740 atttgagttacatcacgagc ctccacaaat tatccccaca gacaagacaa ataactgcat 1800 gcacagttgtggcaactaac ggctacgaca gcgacatacc aaccagtaga gcgaaacgcg 1860 acaagaagagcgagacgagg cacgaagagg ctacacatgc aaggcacaca cagaccgccc 1920 
cacaacacccctcaccaaca gacagacagc ggagcacaca gatccacagg acgccgcaac 1980 aagacacgaacacaaacgac agacacagac aaaagcaaga gaagcagacg agaagacagc 2040 agcgcaaagcaacagagacg gagagacaag agacggaacg acaggaggaa gagaggagca 2100 gcgaggaaagacagagaaga aagagaaggc ggaaggaagg agcgacgacg aagggggagg 2160 acggaagagatgcgcgcggt cagatgagag gaggacagaa agaggaggat gagggagaga 2220 gaagacagaagaacgaagag acgagagacc acccccccgc ggacgagaca acgaagcggg 2280 aggagcgcgaggcggagaac gcgaagagtc ggcagacgga gagggacgag aagaaggtga 2340 cgatgcgagagcgggtcgcg aggcggacga ggggagagag agagcgaaga cgaacgagcg 2400 cgaccgagcgagaggaggag gagaagagag aaggaagaag acgacgacaa tggcggccga 2460 agcaacgaccgaacgaagaa ggagagagag cgaagagacg agacggagag aacgagcgag 2520 gcggaggaacgaaagaagaa aacgaggagc ggaggcagag cgaggagacc gaacggcgag 2580 agaagagagcgaggcaccca acggagagag aaacaacgaa cgagagacag acgagacaag 2640 aactcagaggcgaagacgca cgacgcacag acagagaaga gagaagacaa gcagagaagc 2700 gcaccacggccagtcagcgg aggcacagcc ccaagaaaca acgacgggac cgcgagaaca 2760 gaggagacaaatgagagcgg aggccacacg aacgacagtg aaggaccagg acgagaccag 2820 cagcaaggagaaacgacgca tgagaacaca caacatcaaa atcagacaga cacgcagcgg 2880 cacgcacgacgcacgacagc gagaagagag acacacgacg aacaagcatg caaggagcag 2940 aggacagcacgaacgcaagc aaccagagca aaagcaagag agcgcaggga agagaagggg 3000 cgacagcagcaacagacgag cgacacagag gagaaagaga ctagaaaaag agaagacaca 3060 gaagacgcgacacggaagac actaaagaca cgcgagagaa gagaggagga gcccatgtcg 3120 aagggagaccgaagacgacg aaagagaaac gaaacccgac gaacgcgagc gaggcgcaga 3180 accacaacagaggatcaaag acagagggag caaagagcgg cccaagagca gagccgacac 3240 gcagagaccggcagcagcaa gacacacgac gcgacgacga ggacgaggac gagagcagaa 3300 gaaccgcgacgaacagccca cagagatcac agagggccga cacagcgaag gaggagagaa 3360 aggacacacgagagaagcga caggcactga cgacggtacg ggcccgcgaa gac 3413 62 585 DNA Homosapien 62 cggccgcccg ggcaggtccc cctttttttt tttttttttt ttttttttttttggtttaaa 60 aaagtgcctt tgttttttta ctttgggggg ggggggccgg atgagggggtaggggggccc 120 aggggatggg ggggttgggg aatattcaaa aaatgtctct gaaggaaggggggtgtgttt 180 gagggccggg cccgggggtg 
ccccacggct ccgctttctg gggaaggggaccggcctttg 240 agggagggag ttctgggcag cccatagatt gggccaccaa tctcgatatttagaaacttc 300 cgtgaaaaat attcttacgc tggcccatca tgctgttggg ggtgcccagtatctcttcat 360 cacatggggg ccaagggacc ccgtgtctct tccgtgtgtt ctctgtggtgagaaggagca 420 gctaatgttc ctggtatata ccagagaagc tgggcggggg aacgaccgtggcgccaaacg 480 ctggttcctc ggtgtgtaga aattgtgttt accccggctc ccaatttcccccacaacaac 540 agcgacaaac caaacgtgaa aaacagagat aaacataaag agtga 585 631066 DNA Homo sapien 63 cgagcggccg cccgggcagg tacaaggcct tttttttttttttttttttt ttttttttcc 60 ttgttttaaa aaagtgcctt tgcttaataa cttttggcggtggggccccc acttgagggg 120 gttgggtccc ccgggaaggg ggggcctggg gaaataataacaaaaaggtc tggaagaaag 180 gggggtggtt ttaaagcgcc aaggcccagg gtgggccccccgggccccgc tctcgggaga 240 gggaggacac gccttgaggg aaggatgtct ttggcagacggccatagttg gcgcccccaa 300 ttcatgttta atagaaattc cttgaggaat atcttacgcttgccccatcc cctggtggtg 360 ttgcccagct tccttccatc tctgcgggtc aagggaccccggggtccctt ctgggtcctt 420 ttgtggaaag cgcgggacgt ctctgttttc catagaacggcgtggcggcc gaaacaccca 480 ggggccccaa taggccgtgg ttcccccggg ggtgtgacagtctggtttta ccgccgctct 540 cccaaacttc ccccccccca ccattgtcag cagcaaaaagtcggcccgct gggggctggc 600 gctacgatgc taaacacagg atcatcacga gaacacgctgcacaggcaac aaaagccggg 660 cgaagcaaga cacaagccca cacgaagaac gagatcagcaagcaggcgac agaacaagca 720 tcgtaacgac acactgacct agcgtagatc tactgagcgtgccagatcag aggcgaccca 780 ctacaacagc ttcattactg aacacgtgag ccgatcgacatcacagtacg ctccaaaatg 840 actaaggtca agtaacacag atacaatcga aacaagttgctgaccgtagt tagtacacac 900 aactagatgt gaggatacta gagcaacaaa cgagtgaaaccaagaacaga cacgtagaga 960 acagaagaag acgcggggga ctatacaaga cgacaccaccacgaaaagac aacaccataa 1020 agatacttag acgagcgaag cgaagcaaat acaaaagaggtacgac 1066 64 771 DNA Homo sapien 64 tcgcggcgag gtcttttttt tttttttttttttttttttt tttttaattt aaaaaccaaa 60 atttttttgg tgcggggaaa ccttttttggcccttttttg gttttcccct tttggaaggg 120 gaaaccgggg gggaaggggt ggggaatcccccggggggtg aggtgtttgc ccgttgttaa 180 aaaaaagctt tacgggggag gggcacgaggtggaggagtg gcccaacaaa atatcacagc 
240 caaccgaggt gccccctaga aacaaatgcagccgaggggc tcgacagaca acagaatact 300 caaaaaggta gctgcgcccc cggttattataaacaaccta ataaaattca cagagttata 360 ctaatagaag cacagcggtg tccttgtggtgtggctatta taacaaccaa gagtagcggt 420 gtggttgtgg tcacggtcct ccgtggtgtgcggccgagtg tggataagtg aggtgtggtc 480 tccaagaccc gggggtggtt ggggggcgctccccttgggc gtcgcagagg tcaagcggtg 540 tctcacagac ggggggaagt ataaaacctggggctggaaa gaggccccga aggaagagcg 600 tgggggggtt aacctagggg gggcagaatagcgagtggtc tccggtgggg agtgaactgt 660 gtggtctctc tcggcgtccc acaacctccccccaacattc ccggctacga caaaaaagag 720 taaaaaaaaa agaaaaggac acaaagaaaaaaaaaaacaa tagacgaagg a 771 65 389 DNA Homo sapien 65 atggggcgtgcgactcgtcc cacaaggaag gatgtgttag aaacctgcca ctagacagag 60 ggaggagaaagtgaagaagg cggtcagcag acaggagaag agcaagcggt tccctcagag 120 gagtgaacggtgctagtacc atcagagtgg accatagcac tcaagccctg acaccatgtg 180 gaaagcattaacacagatgg acaagacatc acaaaacatg aaccctacgt gagttgcccc 240 aattctttttgtaatataaa cttggctgca atcccaacca acactcatca cctggaaacc 300 tagtatataagcccagaaca aggcccccaa ggaaagggcc aacccactat catacctctt 360 gtaaattaaaagaccttgag atcacaatg 389 66 843 DNA Homo sapien misc_feature(415)..(415) a, c, g or t 66 gggcaggtac ccaggacaca aacactgcgg aaggccgcagggtcctctgc ctaggaaaac 60 cagagacctt tgttcacttg tttatctact gaccttccctccactattgt cctatgaccc 120 tgccaaatcc ccctctgcga gaaacaccca agaatgatcaataaaaaaaa aaaaaaaaaa 180 gtttttcaac ctttgtgtta agagcccact caagagttgttgttgtttag ctttctatta 240 tatttggtaa attttttcag tttttttttt tggcttttactcggttgtat tcctcctcat 300 tcccattttg ggctcattag acagtgttag tttctcataggaaattttcc tttttaataa 360 aatttgtgac taagcactcc ccttttggct ccctaagagtggtggctctt ccagngggaa 420 acagtgctgt gtaagtgagc actattggac gaagggggtggtgtatctcc gtgagtgctg 480 cgtcgagagg aggtgtctcc ccaataacct cgtgctcggcgaactacctg gcttttaata 540 cgtgccttaa agatagttcg ggtcctcttt atttaccattctcttctctt gggttttcat 600 ttttatttca ccaaaaggtg gggcggtttt caaaatttttgtggcttcta tctcgaaaga 660 aaaaaaaaac aaaaaaaccg ctgtgggcgg tcgaacccgtggggcccaaa cgcggttccc 720 tggttgttga 
aatttggttc cccgcgcccc ccaatccccccacttccctc ccacacacaa 780 acaaagggca gaacgacaag aaaaagaaga acaacaagaaaagaaaacaa aagaaagtaa 840 gtg 843 67 2336 DNA Homo sapien misc_feature(429)..(429) a, c, g or t 67 cacttacttt cttttgtttt cttttcttgt tgttcttctttttcttgtcg ttctgccctt 60 tgtttgtgtg tgggagggaa gtggggggat tggggggcgcggggaaccaa atttcaacaa 120 ccagggaacc gcgtttgggc cccacgggtt cgaccgcccacagcggtttt tttgtttttt 180 ttttctttcg agatagaagc cacaaaaatt ttgaaaaccgccccaccttt tggtgaaata 240 aaaatgaaaa cccaagagaa gagaatggta aataaagaggacccgaacta tctttaaggc 300 acgtattaaa agccaggtag ttcgccgagc acgaggttattggggagaca cctcctctcg 360 acgcagcact cacggagata caccaccccc ttcgtccaatagtgctcact tacacagcac 420 tgtttcccnc tggaagagcc accactctta gggagccaaaaggggagtgc ttagtcacaa 480 attttattaa aaaggaaaat ttcctatgag aaactaacactgtctaatga gcccaaaatg 540 ggaatgagga ggaatacaac cgagtaaaag ccaaaaaaaaaaactgaaaa aatttaccaa 600 atataataga aagctaaaca acaacaactc ttgagtgggctcttaacaca aaggttgaaa 660 aactttttat ttattttttt tattgatcat tcttgggtgtttctcgcaga gggggatttg 720 gcagggtcat aggacaatag tggagggaag gtcagcagataaacaagtga acaaaggtct 780 ctggttttcc taggcagagg accctgcggc cttccgcagcgtttgtgtcc ctgggtactt 840 gagattgggg agtggtgatg actcttaatg agcatgctgccttcaagcat ctgtttaaca 900 aagcacatct tgcaccgccc ttaatccatt taactctgagtggacacagc acatgtttca 960 gagagcacag ggttggggat aaggtcacag atcaacaggatcccaaggca gaagaatttt 1020 tcttagtaca gaacaaaatg aaaagtctcc catgtctacttctatccaca gagacccggc 1080 aaccatccga tttctcaatt ttttccccac tcttcccccttttctattcc acaaaaccgc 1140 cattgtcatc atggcccgtt ctcaatgagc tgttgggcacacctcccaga cggggtggtg 1200 gccgggcaga ggggctcctc acttcccagt aggggcggccgggcagaggc gcccctcacc 1260 tcctggacgg ggcggctggc cgggcggggg gctgacccccctacctccct cccagacagg 1320 gcggctggcc aggcagaggg gctcctcacc tcccagacggggcggcgggg cagaggcgct 1380 cccatctcag acgatgggcg gccgggcaga gacgctcctcacttcctaga tgggatggcg 1440 gccgggcaga gacactcctc actttccaga ctgggcagccaggcagaggg gctcctcaca 1500 tcccagacga tgggcggcca ggcagagacg ctcctcacttcccagacggg gtagcggccg 
1560 ggcagaggct gcaatctcgg cactttggga ttacaggtgtgagccaccgc gtccagcctt 1620 tctttttact ggttctaatt attattattt tttattttactagtccttgc ctgcatacat 1680 ttcctccagg gtacagagct tatgtggttc tttgaccaaatactgttcta gtcattgcat 1740 gtattagaga ccaaggcttt cctcgtcaaa tcaattctgcatggttttcc catcttcttg 1800 gttttctttt tttttttttt ttttttaatt ttttattgatcattcttggg tgtttctcgc 1860 agagggggat ttggcagggt cataggacaa tagtggagggaaggtcagca gataaacaag 1920 tgaacaaagg tctctggttt tcctaggcag acgaccctgcggccttccgc agtgtttgtg 1980 tccctgggta cttgagatta gggagtggtg atgactcttaaggagcatgc tgccttcaag 2040 catctgttta acaaagcaca tcttgcaccg cccttaatccatttaaccct gagtggacac 2100 agcacatatt tcagagagca cggggttggg ggtaaggtcttagattaaca gcatcccaag 2160 gcagaagaat ttttcttagt acagaacaaa atgaagtctcccatgtctac ttctttctac 2220 acagacacgg caacaatctg atttctctat cttttccccacctttccccc ttttctattc 2280 cacaaaaccg ccatcgtcat catggcccgt tctcaatgagctgttgggta cacctc 2336 68 836 DNA Homo sapien 68 gagcggccgc cgggcaggtcccccccccct tttttttttt tttttttttt ttttgttttt 60 tttttttttt ttttttttttttaaaaaaaa aaacccgggg aggggggggg ggggggggga 120 aaaggagaaa agggggggggggtggtgtaa aaaaaggggg tgtggtgcgg gggaaagggg 180 gtggaaacgt ggcgtgcacgaggggggggg ggggtggagg accccccagc tgtggggggg 240 tggtgcatac aatacatgatacggaggggg gtaatacggg ggcgccaaac acagcatgtg 300 gtggaccctc atggtagacagaggaggaga ggagttcaat cattcgagca gacgaacgaa 360 aaacagccgg tgttcacaccaccaagaaag tgtgtctccc ccacgagggg gatacaacag 420 cggggggagg cagcgggcgatgtccgcagc gggggctgct gggaaagaaa agtcctacca 480 caaaaccagt cccttgtgggggggaagaag agagtgtagc agccgctcct ctgagagaga 540 gagaaagtat atcatagcagcgagcacgag cggaggagag agagcgcctc gcacaaagaa 600 gtgaggtgag cggctgccgcagcgcacaca aaataaataa gaggagggta ttaaacacgc 660 cggggggggc agaaaatataacaatagtag cggcgccgcg cgagaaacaa aggtggggga 720 aacaacacgg tgggacaccaacagaggcta tccaccgcgg gggtgaaaaa aagtggtttt 780 cggcggccca caaccatccccaccacaaac tggccgcaac aacaacaaca aggtct 836 69 411 DNA Homo sapien 69cgtggtcgcg gcgaggtttt tttttttttt tttttttttt ttttttaagt ccatgggaaa 
60gggttttttt tcccccaaaa aattgcaagg ggaataaacc ccatttttcc aaaggcgaag 120gtacggcatt tttaaggtat tccgggcctc ccttgggaga agcaaagcga ttttaaaaaa 180gtttggcagc gcgtgaaact cgtggttgga aacattccca ggttaaaaat atttgagcaa 240aagagctttc tttaaaaaaa ccacacacac ttacaccttt tactagaaaa ccaagagtgg 300ggggttaact ctgtgcacat agcgtgttcc gcgggggtga aagtcgttta ctccgcctca 360caattccccc acaacatctg aggaggacag gggttgtgcg acgcgagcaa g 411 70 1343 DNAHomo sapien 70 cggccgcccg ggcaggtacc accattgtaa ggaaacactt tcagaaattcagctggttcc 60 tccaaaaaaa aaaaaaaaga ctgggcggta atcatgggtc gatagcgtggttctccgtgg 120 ggtgaaatgg gttagtccgc tcgacaattt ccaccacaac atacgagccaaggacaaaga 180 agaagaacac aaagcaaaac acaccagagg ggggaaacaa agaaaaagaaacagaccaca 240 gaacagcagt aaacagagca caaacataca acacaacacg cagaaaagacgagaaacaac 300 aagacagaag cgcacgcaaa acaaacgaca aaacaacaaa aaaatagcaaaaaacagaaa 360 gtgacggccc gtcgaagcaa gagaaaggag aaaagggaaa gagaggcggaagtgagcgag 420 aaggagaaga ggagaaaaga gagcagataa ggagaacaga ataaacgaagaagaaaaaaa 480 aaacagacgc agaaagagga gagggcaagg agaaagaaga agaagagagagaggatagcg 540 cgacgagcgg aggaagtaaa gacagacggg gagactgaag aggaggaacggagagacatc 600 ggcacataga cagaggagga ccgccgggat acaagaaaaa aggaacaaacggaagattga 660 gaaaatatga cgagcgacga agcaacgacc gaaaccagac cagcgcgagaggcagagaaa 720 ggagcagaga aacaaaaagc gacagagaaa ggcaagacga aaaagacaagcacgagctac 780 aggaggagcc aaagaatgag aaaagagaga agaagaagaa aacacgaagcaacaagacgc 840 agaacaggag aagagagaga aaacagaggg agacgaagag agcagaggagaagaagaacg 900 aaagtaggga gccaagaaga aacgaaacga gaagtacaaa cagaacagggaagaaagaga 960 ccaaaaggac aaaagaagga aacacagaga agaaaaaaga gaagaaaaaagaaaagccag 1020 agaagaagaa caggcaaacg aaagcaagaa gaaaaaacga cacaacgagagagaagagaa 1080 aaagacaaga gaagcaggag agaatggaaa tacgcagaag aggaggaaacagataacgaa 1140 gagagaagac gaaagaagag aaaaagacag cagaagaaaa gagagaagaagagaagaagc 1200 aagaaaagca gaagcaagaa cgaagcagac aaagagagag cggaaacgacaagaagagaa 1260 gaaagagaaa gagagacaga agaggagaag acgaggaacc ggagctgagcagacagaaga 1320 gagacgaaca gaacaaacag aca 1343 71 3259 DNA 
Homo sapien71 atgagcgcgg acgcagcggc cggggcgccc ctgccccggc tctgctgcct ggagaagggt 60ccgaacggct acggcttcca cctgcacggg gagaagggca agttgggcca gtacatccgg 120ctggtggagc ccggctcgcc ggccgagaag gcggggctgc tggcggggga ccggctggtg 180gaggtgaacg gcgaaaacgt ggagaaggag acccaccagc aggtggtgag ccgcatccgc 240gccgcactca acgccgtgcg cctgctggtg gtcgaccccg agacggacga gcagctgcag 300aagctcggcg tccaggtccg agaggagctg ctgcgcgccc aggaagcgcc ggggcaggcc 360gagccgccgg ccgccgccga ggtgcagggg gctggcaacg aaaatgagcc tcgcgaggcc 420gacaagagcc acccggagca gctctccctg gtggcagtgt ctgatgggag tgtccgtggg 480gctacgagga gcctcctgga cagagaaagg gcacagttcg gcattaagag gcagaaccca 540gccctgcccc agcttggcgg tgagggtcca agagcaatgg tggctgagct cggccagcgc 600gagcttcggc ctcggctctg taccatgaag aagggcccca gtggctatgg cttcaacctg 660cacagcgaca agtccaagcc aggccagttc atccggtcag tggacccaga ctccccggct 720gaggcttcag ggctccgggc ccaggatcgc attgtggagg tgaacggggt ctgcatggag 780gggaagcagc atggggacgt ggtgtccgcc atcagggctg gcggggacga gaccaagctg 840ctggtggtgg acagggaaac tgacgagttc ttcaagaaat gcagagtgat cccatctcag 900gagcacctga atggtcccct gcctgtgccc tttaccaatg gggagataca caaagacccc 960ctcaccccat cctctgacaa cccacaaccc tctcctctct gccaggagaa cagtcgtgaa 1020gccctggcag aggcagcctt ggagagcccc aggccagccc tggtgagatc cgcctccagt 1080gacaccagcg aggagctgaa ttcccaagac agccccccaa aacaggactc cacagcgccc 1140tcgtctacct cctcctccga ccccatccta gacttcaaca tctccctggc catggccaaa 1200gagagggccc accagaaacg cagcagcaaa cgggccccgc agatggactg gagcaagaaa 1260aacgaactct tcagcaacct ctgagcgccc tgctgccacc cagtgactgg cagggccgag 1320ccagcattcc accccacctt ttttccttct ccccaatact cccctgaatc aatgtacaaa 1380tcagcaccca catccccttt cttgacaaat gatttttcta gagaactatg ttcttccctg 1440actttaggga aggtgaatgt gttcccgtcc tcccgcagtc agaaaggaga ctctgcctcc 1500ctcctcctca ctgagtgcct catcctaccg gggtgtccct ttgccaccct gcctgggaca 1560tcgctggaac ctgcaccatg ccaggatcat gggaccaggc gagagggcac cctcccttcc 1620tcccccatgt gataaatggg tccagggctg atcaaagaac tctgactgca gaactgccgc 1680tctcagtgga cagggcatct gttatcctga 
gacctgtggc agacacgtct tgttttcatt 1740tgatttttgt taagagtgca gtattgcaga gtctagagga atttttgttt ccttgattaa 1800catgattttc ctggttgtta catccagggc atggcagtgg cctcagcctt aaacttttgt 1860tcctactccc accctcagcg aactgggcag cacggggagg gtttggctac ccctgcccat 1920ccctgagcca ggtaccacca ttgtaaggaa acactttcag aaattcagct ggttcctcca 1980aaaaaaaaaa aaaagactgg gcggtaatca tgggtcgata gcgtggttct ccgtggggtg 2040aaatgggtta gtccgctcga caatttccac cacaacatac gagccaagga caaagaagaa 2100gaacacaaag caaaacacac cagagggggg aaacaaagaa aaagaaacag accacagaac 2160agcagtaaac agagcacaaa catacaacac aacacgcaga aaagacgaga aacaacaaga 2220cagaagcgca cgcaaaacaa acgacaaaac aacaaaaaaa tagcaaaaaa cagaaagtga 2280cggcccgtcg aagcaagaga aaggagaaaa gggaaagaga ggcggaagtg agcgagaagg 2340agaagaggag aaaagagagc agataaggag aacagaataa acgaagaaga aaaaaaaaac 2400agacgcagaa agaggagagg gcaaggagaa agaagaagaa gagagagagg atagcgcgac 2460gagcggagga agtaaagaca gacggggaga ctgaagagga ggaacggaga gacatcggca 2520catagacaga ggaggaccgc cgggatacaa gaaaaaagga acaaacggaa gattgagaaa 2580atatgacgag cgacgaagca acgaccgaaa ccagaccagc gcgagaggca gagaaaggag 2640cagagaaaca aaaagcgaca gagaaaggca agacgaaaaa gacaagcacg agctacagga 2700ggagccaaag aatgagaaaa gagagaagaa gaagaaaaca cgaagcaaca agacgcagaa 2760caggagaaga gagagaaaac agagggagac gaagagagca gaggagaaga agaacgaaag 2820tagggagcca agaagaaacg aaacgagaag tacaaacaga acagggaaga aagagaccaa 2880aaggacaaaa gaaggaaaca cagagaagaa aaaagagaag aaaaaagaaa agccagagaa 2940gaagaacagg caaacgaaag caagaagaaa aaacgacaca acgagagaga agagaaaaag 3000acaagagaag caggagagaa tggaaatacg cagaagagga ggaaacagat aacgaagaga 3060gaagacgaaa gaagagaaaa agacagcaga agaaaagaga gaagaagaga agaagcaaga 3120aaagcagaag caagaacgaa gcagacaaag agagagcgga aacgacaaga agagaagaaa 3180gagaaagaga gacagaagag gagaagacga ggaaccggag ctgagcagac agaagagaga 3240cgaacagaac aaacagaca 3259 72 762 DNA Homo sapien 72 cgagcggccgcccgggcagg tacgcctgta gtcccagcta ctcaggaggc tgaggcagga 60 gaattgcttgaacccaggag gaagaggttg cagtgagcca agatcatgcc acatcactcc 120 aacctgggcaacagaacaag 
aacccatctc aaacaaacaa acaaacaaaa aaaaaaaaac 180 tctggtctcctttaggatat gttaccgtgc cccacgtgca gactagaaga aattaactgg 240 tgttttggaacctttttacg tgcaaacttt gaaaatgtgc tagaaaccca agcattgaag 300 aattaaattactgtgggtgg gaaacacacg ggcattgtgc attattgcat tattaccttg 360 ggtaggttatagtaaggttt agaaaggcat agcttgggtg gatattctga accaccattg 420 aattcttttggggccagggt tagggaaggc acagccagat tccttatggg aattgaatta 480 cctcaaattcgggtgggtcg tgagatttct agagatttaa cccactgtgg tgccattttt 540 taacaaaaaaaaaaaaaaaa aaaaaaaagg gcggggggga aacccggggc caacgcgggg 600 acccgcgtggggtggggaaa ggtggggtac cgccggcgcc acaaattccc ccaaaatttc 660 atcgcagcacacaaaaaacg aacacaccga acagacacag agacacaacg accacacaga 720 ggacagaacacaaaaggaac acaaacacac acaaagagga gc 762 73 989 DNA Homo sapien 73gctcctcttt gtgtgtgttt gtgttccttt tgtgttctgt cctctgtgtg gtcgttgtgt 60ctctgtgtct gttcggtgtg ttcgtttttt gtgtgctgcg atgaaatttt gggggaattt 120gtggcgccgg cggtacccca cctttcccca ccccacgcgg gtccccgcgt tggccccggg 180tttccccccc gccctttttt tttttttttt ttttttttgt taaaaaatgg caccacagtg 240ggttaaatct ctagaaatct cacgacccac ccgaatttga ggtaattcaa ttcccataag 300gaatctggct gtgccttccc taaccctggc cccaaaagaa ttcaatggtg gttcagaata 360tccacccaag ctatgccttt ctaaacctta ctataaccta cccaaggtaa taatgcaata 420atgcacaatg cccgtgtgtt tcccacccac agtaatttaa ttcttcaatg cttgggtttc 480tagcacattt tcaaagtttg cacgtaaaaa ggttccaaaa caccagttaa tttcttctag 540tctgcacgtg gggcacggta acatatccta aaggagacca gagttttttt ttttttgttt 600gtttgtttgt ttgagatggg ttcttgttct gttgcccagg ttggagtgat gtggcatgat 660cttggctcac tgcaacctct tcctcctggg ttcaagcaat tctcctgcct cagcctcctg 720agtagctggg actacaggtg tgagccacag cgcctggccc gagagttgtc gatatgctcg 780caggaagtat ttctgtgtta aaaagttgag aagatggaaa ctgaatcctc tttgtattca 840gaaggctgtt tcggaaggtc actgttggca ggcttctccc tacagggatt cagcagtgag 900ggagcagagt atttggggga caactgcttc ttctggaggg gcgagaatga gatggagttc 960accagcagct ctttatgtca gacttttag 989 74 1725 DNA Homo sapien misc_feature(83)..(83) a, c, g or t 74 tggtcgcggc cgaggttttt ttttttttta tttttttggcagttttaaaa 
aggaggattt 60 atttggacaa gttccacttt agncgcaata tattcccctaaaaggaaatc tcacaattac 120 aagtgaaaga tttaaatctc agggccctca gaatttctcattacaaacac ccaagaccaa 180 aatctcctag agatatctcg gtattgtgcg ttcctcanaatttttctccc attataacct 240 ttaaacaaac aaaagccgtg tgggcgttta aatccatgtggtccatatag ccgtgtgttc 300 ccgtgtgtgt gtgaacattg tgtttactcc gccctccacaattctccacc acacacacca 360 ttacgcgagc acagggggaa gtgaaaggtg tagaaaacgtagtcggggga aagatagaaa 420 cgacagagca aacacgcaga gctactagaa gagaaaaatcagagagaaag ataaccatcg 480 cgtcaacgac ctgcaggaga gcagagacat ccagagcgcacgcgcggacg aaatagacga 540 gatatgccat acgggacaca gcgcgacacg agggtatggatgagaccagc ccaactgaaa 600 gcatatgata agagagcgag actgatacgg acgaaaaaggacgcaaacca cctcgcggca 660 cccctgaact aaagacaaaa agaaaggaag aaaaccaaaacatataaaga aagacgagac 720 agacgaggaa caaaaaaaag aaatgaagaa gagagcaagaacgagcagac gataaatgag 780 agaaaacaaa gatagacaga agagagatgg agagagagagagcgaagaag cactatcaac 840 agatagacac tcacgataca gcgatacaga ataagaactaggagacgaac gagagaggac 900 agagaaacga gaggaacaag ctagaaccac aagagagataaaaagaaaga agagctaagt 960 gaacacgcag cgcgggagat ggtagagaca aagaacgcgatgacagaggg aggcgagggg 1020 actacggata gagtgagcgc agcggctatg gataaggaagatagcggata cattgaggga 1080 gggcgcgcgt aggtatcaga ccgcgagagt cattgatcgaataggaggag tacgaaggac 1140 agagagagtg atagagtgat aaaaaatcaa agataagatatacgatagat cgaggatcac 1200 gaaacgaaag agaatgcgag agaagaggga ggtagatagagagactagag agagacagag 1260 ctaagaaaca agaaaagaac aagatagata gcagtagagagcggagcagc gataagaaag 1320 aacaaggacg aaaaagagaa gagcagagag caaagaagtaacgtacgcaa acgagagaca 1380 aacctgaaca gacgcagaag acggacagcg aacaaatcgaacgagtggag gaggaactaa 1440 cacgaaccca tagaagagag caaagcaagc acgcgaagggagaaagcgac gagcaggaga 1500 gacggacacg aggagcgaga tagaggtgta atattcgcgaagtagcgaga agactgaaag 1560 tgaacgggcc gaggaatgaa gttaaagagt cgactagaacgacagaggac gcgaaagagt 1620 aagacatagt cggctcaagg cagtagtgat atagagcgtagagcagagga gagagtataa 1680 tgtggtccag gagcgatgag agcggacgct gagtgcgtagtatat 1725 75 1075 DNA Homo sapien misc_feature (346)..(346) a, c, 
g ort 75 cgtgcgccgc ggccgaggtc acatcgtatt ctgtgccgag cttcgcacac ctgacgtacc60 gagcatcatg atcgtctccc agcgaccctc acagattctc gggcctgcaa cccctgctat 120tgacgtttga atggaatggt ctgtgtcatg cactcacaga tcgctattac tatcctcgtg 180caatgaaggc caatgtgtgc gaccagatcc ttcctatgct aactcgtaag tagaatcggg 240gtagtaactc gcgaatcacc cttagtatat ggagagacct ctattcatcc acacatgcca 300ctactcgact tggaagaatg gcctttgttg gggtatcccc gcgcgnagtt gccaaagata 360ggtcctattg gggccagttg agagtacgan ttcgagtatc gattcacgac ctagttctat 420tcccgtaagg tagatgggaa acaatataga tttcaatccc cagccacgag caacaatttc 480gcaaacgagc cacaccgata tgggaagcct aaaaccctgt gnntttccca tgtnagtncc 540caacgtttta tgttttttcc ttatttaatg tgtgaagaag ataaaaatta gtccgtgnta 600cttcttaaaa agagagaaag agacaaagag agaaaaaaga aaaaaaaaaa aggcgtgtgt 660gcgcgggtgt acaccccgag tgcgcgtccc aatacgcgtg tggtgtctcc ggtgtgtgtg 720tgtgcacaca tgtgtgtgta tatctcgcgc gcgctccaca aattctccca caccaaaaca 780attttcgttg ttagaacaaa aaattgtaaa aaaaaacaaa aaacaaaaag cacagaacaa 840acaaaaaaaa caaaagaaag aacaaaacac aaaaaaaaag aagaaaaaaa aagaagaaaa 900gaacaaacgg aggaaggaaa gagaagagaa aaagaggaag aagaggaata aaacgaggag 960cagagaaaga gaccaacgca aatgagacgc aaagcacaaa caataagaga caagaggaaa 1020aaaaaaaaag aaagacgcaa agaaagcaaa agcgaacgag gagacagaaa ggcac 1075 76 491DNA Homo sapien 76 ggctgtggtg caggtgtgtg tgcctcaact attgccaatgtgttccacct agctggactt 60 tccttccttc tctaatgcat gtgcagtatg actcccatgaaaatgatgaa ccttgtcatg 120 aagttctcat cgccaacgaa gaacgactgc ataggaagaatatgaagaaa tagctgctaa 180 actgactaag atcgacttca tgtagttgaa gaaatgctctgttcaccgat ggatgccttg 240 ctgtctctat taattgatct aaacctgttg agcagtcagagtcttgcact ggatttagtt 300 tagcgtgccc ataggatgca tcgcatcttg gcttactcttggtcttagct gtttcgctgt 360 gtgaaatcgt tatccgctca cgattccatc acaacatgcggatgcagcac gatatactgc 420 actagataaa tggaccaacc aactaaattc tctcaaccaggctgtagtca gtaaactggc 480 ttaacagaga a 491 77 1440 DNA Homo sapien 77aagaagatcg actactatag gagccatggt tatctagatg catgctcgag cggcgcattg 60tgatggatag cggcgcccgg gcaggtcaat ctttaaattc agtattcagc 
ttccaaagat 120ggggtgccca taatagactt aaacatataa tgatggctac agaacaaata agtatacgac 180aaatgtaaaa acaggaaatg taagctccac tctcaatctc ataccaaggg tgagagttac 240gagatgctaa agcaaaataa atgtaggttc ttattatatc tatttcctgt atatcatgca 300gtctgcttct tttgagtatg ccttacggag ttacccaatt taagcttacg aggattgtaa 360gtgcaattgg ctgggaactg acaacatgtg atccaagcta ctacacccct gtgctcactc 420tgtcacttct caaattctgc gcgctggaac acattcacaa gaacaacagg gctagagcac 480tgcaaggaaa ccacacacca ccaaactcaa aactcagaaa cacccacatc tccagagagg 540cacagagagg atacaaagaa tactgcgcaa gacaaagaaa tccacagact ccacacccca 600gggcacaacc tggaacgcaa aattcaaaaa actaacgcgg accaaacgcg gaagccccga 660cccaaaagag cacaatataa agaggcccca cgcccacgct gcgcgcacca cacgcacacc 720cgcacggcca acacacaggg ccctgaaaca cacagcacta cacacggggc caacaccaca 780ctactaagca caacccaata caccagcccc ccctcgacgc acacggatac cccgggacca 840acataaaaca cagacaacaa cgactccaca acacccaact aaaacgcgcc aaacaccacc 900cactcacccc acaagcagca gcgacaaaaa caccacccca accaccaaag cgcaaacacg 960ccccccaaaa acgaccattc agagagccgg ataaaaactc aaacaagagc accacaaaca 1020aaagaccaca tagccaaaac acccatggaa atgtattcag caacaccacc taggactcaa 1080aatccccgca gcaacacaac accaaaacac ttacccccca cccaacacac aaacaaataa 1140taccacgaaa aaaccacaca acgcggtggc cacaccccac agcaaacaaa gacacacaaa 1200acacaaaata ccacacacac acagaccaca ccactcaaac aacagcagtc accaacacac 1260acgaccacac acacactacg caccagaact ccaacgcaca aaacacaaca ctactcacaa 1320caacacaacc aacgacacca caataaataa acagacaaac aaaccagacc gaacaccacc 1380acaccaccac accagcgaca actacagaca ccacaaacaa caaaaccaaa caacaaaagt 144078 1653 DNA Homo sapien 78 ttttttattg atcagaattc aggctttatt attgagcaatgaaaacagct aaaacttaat 60 tccaagcatg tgtagttaaa gtttgcaaag tgggatattgttcacaaaac acattcaatg 120 tttaaacact atttatttga agaacaaaat atatttaaaattgtttgctt ctaaaaagcc 180 catttccctc caagtctaaa ctttgtaatt tgatattaagcaatgaagtt attttgtaca 240 atctagttaa acaagcagaa tagcactagg cagaataaaaaattgcacag acgtatgcaa 300 ttttccaaga tagcattctt taaattcagt tttcagcttccaaagattgg ttgcccataa 360 tagacttaaa 
catataatga tggctaaaaa aaataagtatacgaaaatgt aaaaaaggaa 420 atgtaagtcc actctcaatc tcataaaagg tgagagtaaggatgctaaag caaaataaat 480 gtaggttctt tttttctatt tccgtttatc atgcaatctgcttctttgat atgccttacg 540 gagttaccca atttaagctt acgaggattg taagtgcaattggctgggaa ctgacaacat 600 gtgatccaag ctactacacc cctgtgctca ctctgtcacttctcaaattc tgcgcgctgg 660 aacacattca caagaacaac agggctagag cactgcaaggaaaccacaca ccaccaaact 720 caaaactcag aaacacccac atctccagag aggcacagagaggatacaaa gaatactgcg 780 caagacaaag aaatccacag actccacacc ccagggcacaacctggaacg caaaattcaa 840 aaaactaacg cggaccaaac gcggaagccc cgacccaaaagagcacaata taaagaggcc 900 ccacgcccac gctgcgcgca ccacacgcac acccgcacggccaacacaca gggccctgaa 960 acacacagca ctacacacgg ggccaacacc acactactaagcacaaccca atacaccagc 1020 cccccctcga cgcacacgga taccccggga ccaacataaaacacagacaa caacgactcc 1080 acaacaccca actaaaacgc gccaaacacc acccactcaccccacaagca gcagcgacaa 1140 aaacaccacc ccaaccacca aagcgcaaac acgccccccaaaaacgacca ttcagagagc 1200 cggataaaaa ctcaaacaag agcaccacaa acaaaagaccacatagccaa aacacccatg 1260 gaaatgtatt cagcaacacc acctaggact caaaatccccgcagcaacac aacaccaaaa 1320 cacttacccc ccacccaaca cacaaacaaa taataccacgaaaaaaccac acaacgcggt 1380 ggccacaccc cacagcaaac aaagacacac aaaacacaaaataccacaca cacacagacc 1440 acaccactca aacaacagca gtcaccaaca cacacgaccacacacacact acgcaccaga 1500 actccaacgc acaaaacaca acactactca caacaacacaaccaacgaca ccacaataaa 1560 taaacagaca aacaaaccag accgaacacc accacaccaccacaccagcg acaactacag 1620 acaccacaaa caacaaaacc aaacaacaaa agt 1653 79300 DNA Homo sapien 79 gataatcata tagcgatgtt ggctctaatc atgctcgagcggcgcatgtg atgatcgtgc 60 gcggcgaggt acatacactt atgcacttgg aactgtactgtatcatacgt acaacctctg 120 acacaagctt tttttttttt tttttttttt ttccctattgtaattgatcc attttttttt 180 tgatcaatac aaaaaaattt ccctatttta ataaacccaaaaccttggtt atcatggtca 240 tactgttccc tggtgtgaaa tggttatccg ttcaaaatttccacaaaaaa tacaaaaaac 300 80 486 DNA Homo sapien 80 tttactaagatcctgcattt tattttgtta ttgttgcaaa aagaactcaa tacaaagcca 60 atataaaaaaatcaatactc attttaaaac ataaacagta 
atttctgaat gtctaacatt 120 ctcctatgcaaagactggga gaaagaggaa gggggagaga gaaaataaat tctttatttt 180 aaacctttcttcaccctgct gggaatgcac atgcccgagc aaatgattcc agcttaaccc 240 cttctggactggtcattgaa gatagggttg gaagaacagt attttagaat ggcgatgaac 300 agtgtcattattaactatat gtacatacac ttatggcact tggaactgta ctgtatccat 360 gacgtagtaacctctgacac aagctttttt tttttttttt ttttttttcc ctattgtaat 420 tgatccattttttttttgat caatacaaaa aaatttccct attttaataa acccaaaacc 480 ttggtt 486 81736 DNA Homo sapien 81 aaggttctag tgattgctga ggagccggtg agcacccagccaggaggcag aaaactgaaa 60 agggcagggc tgaccagtac aggtcctgac agaggacgagaaaaggagag ctcgaagact 120 tggctgcaaa tggactttgg aacgtacaga agatagctggaggaaattca gccagaagtg 180 ggctgtgctg ttcacttggc agcggtcggc gcactgtctaagcaagcagc cagtcaccat 240 gatcttgttt attcaccact ttcactgaga aggacaccagtttatcgtaa cccaatgggc 300 gagaataagt aggaagcgtt acgtaattca gttaaacttgtcttggacga caaatttgga 360 gacttggtct tctagatttc ctgtccagca gatgctattggaaagatgtg aattgcactg 420 agcttgtagc actattcctt ttctgcaaag atagaccatagttaacagtg cgttagtgac 480 acatgactag tgctacccgt ctttggaagc caacttggtccgtcagtcaa gtttgggcaa 540 atctaaagtt agcaaaggat ttctgccctt gaaggcacccataatcgaga aaaaacaaga 600 gaataccact cggaacacag accatataaa gtccggggtgaggaagacac agcgggggcg 660 aaggaagtgc gttccacaca cgtgggggaa gcctatttagagatcccccg tcagaggaaa 720 caggatcgca aaagac 736 82 191 DNA Homo sapien82 ctggcgtgac atctactggt catatgctgt ttccctgtgt gaaacttgtc tactccgctc 60actaatatcc agcacaatca aaggcgagcc aggccatgtg tcccttgaca cagttctaag 120ataaactctt ggtatctctt aacttctagg tggaagacat atacatacag cccattccca 180tgagagggac c 191 83 200 DNA Homo sapien 83 tgaaaatttt aatcgatcacctataggggc gatgggtctc taatctgtcg agcggcgcgg 60 tgtgatggat gcggcgcccggcgcgtctag ttgagagagc tgtttgcctt gttctagaat 120 tcttattttt catttcttttctttcttgta attcttattt ttggtttgcc tggactgttt 180 tgcatactcc aatctttctt200 84 292 DNA Homo sapien misc_feature (173)..(173) a, c, g or t 84tttttttttt ttttttttgg gaactaaaaa agaacttatt aatggagggc aaggggatgc 60aacaatacaa aaatcaaaag ctgggtgtat 
cagtggctca taggcgtgtt ccccggggtg 120gtgaaattgg tcttactccg cctcacaatt cccacacaac attacgagca agntggggca 180aacgcgaacg aggagggaca caagagagca gcagacgaga cgaaaaaaga aaccaatgaa 240gcggaaagga gaagaaacag aggaagaaag ggaggaagat aaacaagaaa gg 292 85 437 DNAHomo sapien 85 gcgtggtccg gccgaggtcc cccccccctt tttttttttt ttttttttttttctgtggga 60 agggctaatt ttaattaatt ttctgtaagc cttagggtaa aaacaccttaggcggaaatt 120 ttaactattc aaaaaaaagc agttcctacc aattccatgg gtttttaatacctctaacca 180 gatgtgggaa acgcatttaa ctggaaagca aaatatttag agagaaaatacgactattta 240 tccaaattat ataaaatgct tgtacgatag gagaataaat gttgctttccaagggaacag 300 gcacaacact tatttttata gacggcatgt taaaacgctg ggcgtacatctatgtgccat 360 acgcttgttc tcctggttgt ggacaatggt gtatccccgc cccacattccccccacaact 420 tacccgaaca acacgat 437 86 762 DNA Homo sapienmisc_feature (450)..(450) a, c, g or t 86 gcgtggtcgc ggccgaggtccctttttttt tttttttttt tttttttttt ttttttttgg 60 gattttttct ttggccccccttttttattt tccccctctt ggaatttcac aaaggtaaat 120 taaagaagat atttgtaaattaacacagag aatttatctc acaccattat aaaattctat 180 ttctcacaca agggggataaacaaagaaca gggagtgaca cgccaaggct cagagagacc 240 tttttaaaat aaagagtggaggcaaaatac ccccgtgcgg aacacagaga tctcttgtgt 300 ggtccacgtg tgaatatctcaatatcacca cgagacagag acacccacct cgtgtgtgtc 360 cccgctgaga atattatacacaacactcac cactctctat ctcttatata tatagagagg 420 ccgcgcgtga tagagagtgcgtgctgtctn ccctctctag agagatctct ctctatatct 480 ctctatagag agagaggtctctcctctgga gagatatctc tcctctctta tatataagag 540 cgcntggngc gcgtatatctctcgcgtggc gccacatacg cgtgtgtgtc tctcgcgtgt 600 gtgtgtagaa catgtgtgtgtatatctcgc ngnctctcac acatatctct ctcacacaca 660 caacacattt tccgcgagcaccacaacaac taacgtggca ccccacacaa cccaacaccc 720 caacccacac acccacaccacaagcgcaca caaccacaac ac 762 87 476 DNA Homo sapien 87 gggatttaatatatagcgat ggtcttaatc attcgagcgc gcagtgtatg atcgtgtcgc 60 ggcgaggttttttttttttt tttttttttt tttttttttt tttttttttt tttttttttt 120 tcattttttttttttctttt tttttttttt acaaaaaaat tttctttttt tatacccccc 180 aaatttgcttttttgttttt ttttggtaaa tttttttccg tccccaaatc 
cccacaacaa 240 tcatcaacaaaacatgtcat ggtagatgca gtcccgcctg ctcaccagca cactacgctg 300 tacagctaccaacacgagct cagagagcag gacgaagtac atgcaatcgt agctgactag 360 agagcactgacatgagcgga gtggacgata tcacggtcgc agagcgtagt aaagtcggca 420 agtgagctgaaggacatagg agatagatca gatagtagca cattggtcat atacgt 476 88 842 DNA Homosapien 88 gccggcccgg gccggtacac tgtccaacaa gtataatgcc ttgagaagtctcgtatttca 60 aatcctcttg ggatcgcgca tggcgtgagc tgtggtgcgt acataaagtacgtggtgctg 120 aactacgtgg agttcttcta gtcttggtta ctagtgcgga ctataccactggatcaggtc 180 ttcgatcttt agttcgtggg aacatatgtt aacgagccaa gctacgaagacatggctcgc 240 cagacttgtg ggcaacgcac gggtgcaggt ttgtcagtgc ttattgggcgtgtgtaagta 300 caagcgcaat tcgtagcccg catagacatg caaggacatg gactagaacttgcccaagat 360 gcctacaacg aagagcgagc gtgttaacaa actacgcaat atgcaatgactatggcctca 420 gtagagtaat attgagtagt gcctccatgg gttctagttt aagggcgataacacctagtg 480 tttgaatttc acacattctt aaacagtact aacgttttag agacctagggtacattcttg 540 catggacatg ggtagcgtat ctaaccctag aaataagaac cacgtcactgaagaatagac 600 ctacttccaa ggtaacccat cgttttttag aaaacccgag gatttaaccgcgagagagaa 660 tcctaggagt ctcaaggaag agtttaactt aaagggggtg ggctccgtgggaaaggggtg 720 gtttccccta aacgaattaa tctcagagtt attcccgtgt ttaaatttaacaagtcttcc 780 cattttaagc caagttggca aaaaacacca aaaacaaaca aaaacaaacacaaaaaaaca 840 gt 842 89 1729 DNA Homo sapien 89 acagaattcg gcacgagagactataccact cccataccct ataactttgt ttgttctatt 60 tcacacatat aattttccgagacaagatgt tctcatttaa gcaacaagaa gattcgtctc 120 tcgctattac tgtaactgctgtttatatcg tcatgtcccg gaaaggtccc tgtcttccct 180 gaatggtctc taccaacttcacctccggtt ctaggtgtca tggctgcccc aagagtctag 240 agacgacaac ttctccgcttcctcggcgat ggcggcgtcc gggagcggta tgtcccagaa 300 aacctgggaa ctggccaacaacatgcagga agctcagagt atcgatgaaa tctacaaata 360 cgacaagaaa cagcagcaagaaatcctggc ggcgaagccc tggactaagg atcaccatta 420 ctttaagtac tgcaaaatctcagcattggc tctgctgaag atggtgatgc atgccagatc 480 gggaggcaac ttggaagtgatgggtctgat gctaggaaag gtggatggtg aaaccatgat 540 cattatggac agttttgcttgcctgtggca gggcactgaa acccgagtaa atgctcaggc 600 
tgctgcatat gaatacatggctgcatacat agaaaatgca aaacaggttg gccgccttga 660 aaatgcaatc gggtggtatcatagccaccc tggctatggc tgctggcttt ctgggattga 720 tgttagtact cagatgctcaatcagcagtt ccaggaacca tttgtagcag tggtgattga 780 tccaacaaga acaatatccgcaggggaaag tgaatcttgg cgcctttagg acatacccca 840 aagggctaca aacctcctgatgaaggacct tctgagtacc agactattcc acttaataaa 900 atagaagatt tggtgtacactgcaaacaat attatgcctt agaagtctca tatttcaaat 960 cctctttgga tcgcaaattgcttgagctgt tgtggaataa atactgggtg aatacgttga 1020 gttcttctag cttgcttactaatgcagact ataccactgg tcaggtcttt gatctttagt 1080 tcgtgggaac atatgttaacgagccaagct acgaagacat ggctcgccag acttgtgggc 1140 aacgcacggg tgcaggtttgtcagtgctta ttgggcgtgt gtaagtacaa gcgcaattcg 1200 tagcccgcat agacatgcaaggacatggac tagaacttgc ccaagatgcc tacaacgaag 1260 agcgagcgtg ttaacaaactacgcaatatg caatgactat ggcctcagta gagtaatatt 1320 gagtagtgcc tccatgggttctagtttaag ggcgataaca cctagtgttt gaatttcaca 1380 cattcttaaa cagtactaacgttttagaga cctagggtac attcttgcat ggacatgggt 1440 agcgtatcta accctagaaataagaaccac gtcactgaag aatagaccta cttccaaggt 1500 aacccatcgt tttttagaaaacccgaggat ttaaccgcga gagagaatcc taggagtctc 1560 aaggaagagt ttaacttaaagggggtgggc tccgtgggaa aggggtggtt tcccctaaac 1620 gaattaatct cagagttattcccgtgttta aatttaacaa gtcttcccat tttaagccaa 1680 gttggcaaaa aacaccaaaaacaaacaaaa acaaacacaa aaaaacagt 1729 90 1378 DNA Homo sapienmisc_feature (547)..(547) a, c, g or t 90 gcgggccgcc cgggcgggtacccgggccca ggaggacgcc gagcgggcag ccccgtgagc 60 cgcgtggacg tctagcgctgccgtgggatg atgcgctccg ctacgagaag gagctggtcg 120 aagtcgctgg aaggaagtctcgtaatgaag tctcaaggat gtgggacgga tgttgcccgt 180 tctattgacg aaggaagtatggccagagtc cccaggtgtg tgacgccagg tggagccagt 240 ggtgcatcga gaggcaataggggcagagga gtcggggcaa agctgctgtg acatggtcgg 300 ccgatgggag gccatccttgcggatttcgc ttctcttccg tgaagatgct tagtgatagg 360 gggccggtcg caatcgcatctcataccgag tacgattccc agcattcgca cagtcagtag 420 cgtttagccg cgctatggacgggacgcgag agccgtgtcc gtcccccttg gggaggtctc 480 tgggcgtgtg aagggcgaaagctaggagta acaagggtgt atgataggta tatgtgcccc 540 
tacttgnaga gggggggaccaatggggggc cttctaatgg tcgcgctggc cgcttgttgc 600 atgatacatc tacagcagcggaaaccggca ttttgacgaa attcactcag acataaaata 660 caataaacgg gaaagtcgaacacgcggcag taaaacctgc ccacgcgcgg ggcgagacca 720 atctccaggg gggcccaacataagtgagga gcgtagcccc taggggcgga gtggatagag 780 ggcgcacgac ggcgcccggcaaacccacca aatttcacac gcgggagaac agataaccgg 840 aggcgacaca caaacccctagagaataaag ggcgacaggg gggcaaagag cgaggatacc 900 caggaggaag aggaaactacaggcacacac caagagagga aaaagtgaac atacacgagg 960 gacgtggcac atggccaaaaaatggacaac gggacggacc cacatatcca taagatgcgg 1020 ctcggatgac cacacacagacccaacatgc ggtaaagacg acaacacacg ggaccggaac 1080 catgactaag gaaagccaccaccgagacaa acagcaaaca caacctatat tcacaccgtg 1140 ggtacacaca gtataggacaaaagaaatcc actacaacaa tatgggtagg agcaccacat 1200 agagtgacaa cgaaaggggattgggatcac aaacccagac aacaacgtct gagaagcaca 1260 tacaggaccc cacacgagagacgcaccaca acaccgacgg aacgtgccct gcaaaggtat 1320 agaaccacgg cggatacaaccggaactcac accatcacca acgccacaca acaaaaat 1378 91 1278 DNA Homo sapienmisc_feature (827)..(827) a, c, g or t 91 gcggccgccc gggcaggtcccccccctttt tttttttttt tttttttttt tttttgtggt 60 ttaaaaaaag tggactttggcttttttcct agtgtgggcg agggtggggc ccgtgcagtg 120 agagggtggt aaggtgtgccccaggggaag tggggtgggc ccgtgtgtgg aaaataatga 180 aacaaaagag ggtcgtggaaagagaaagag gggggtgggt ttgtaagagg cccagtgcgc 240 acaaggtgtg cgcgctcctcagggctcgcc gctctctctg tggaggagtg gagccaccgc 300 tctgtgtgag agagaagagagagtctcagt gtgcgcgggc gcccatatat gctgtgtgcg 360 cgacccacaa atctcaatattataaaaaat cttcgtgtgg agacaatctc tatagcgcgt 420 gtccccactc tccggtgtgtgtgtgtgtgt gtctcccagc acttctcttc tcaacacaag 480 agcgcggctc gagagtgaaaccccccgggt gggtctccct ctgctgcgtg gggtcccctg 540 tgtgtggaga gagggcagcagacaataaga ttcgctgtgt gtaattctcg atgatgaaag 600 cccccgtgcg cggcgtataaacacctgcgc gtggcggcca aaatgaagcg ctgtgtgtcc 660 cgcgccgtgg gtgtgtgcacacatgtgtgt ggttatctcg gcgggctcta cacaaaaatg 720 tctcccacac accacaacttatttgtggcg cgcagcccac acaaagcctc cacaacgcgt 780 cggggttgtt ccttcctgcgccacccaaca acgactgatg cgggccnaca aacgaatcag 
840 cctaaaacac aacaaaaaccacagcaccac acactgaccg tagacaccaa ccaaatagac 900 aaccacaaac aaaaacacaacacacaacac accaaccaac gaatcacaaa caaaccacaa 960 caaaacacag acaaacaaccaacagcgaga caagagaacg gagataggca taaggcgcga 1020 tgaacctaag agctctcgtagagaacctgg cacaccaact agcgaataac cgaccgcaat 1080 ggtcacctag taaaagccggaccagaataa ccgatatgag atccaaccca cacaaaaacg 1140 aaagatgata aagatgacaaacgtaaaaat caaataatga gatagacata agacgaaatt 1200 gaagacaaca gactcggtatgagaatacag aaacaatcaa gagtagcaaa gagacagaac 1260 aaacacaaag gaaccccc1278 92 421 DNA Homo sapien 92 cggccgcccg ggccaggttt tttttttttttttttttttt tttttgggga aaaaattttc 60 taaaagggct cttttttttt ttttggggacggtgttaaaa aaaataccgc ggtgttttta 120 aaattattgg cattatgagt tcccatttaacaaattcgtg tgtgttccca aaatatagtt 180 ctctttttac ccagggtctc gtggttaaaatataccaaca cccgggtata aaattttctc 240 tgtgggaatc cttattccat aaaaaatgggccccagggtt tctcacacca ctagtgtgga 300 aaatgttgtg gggtgagatg gagaaatctcactttttatt atatctcaac gccggggggg 360 aaacctcgtg ggccaatagc cgtgttcccgtggtgggaaa gtggttatcc ccgccccaaa 420 t 421 93 544 DNA Homo sapien 93acaaatctta agtataaatt tactgagttc ttgcagacat atacacctgt gtaaccaaac 60ctttccaaaa tttagaccat tgcaatcatc ttcagaaagt ttcctaaatc cctctcccag 120tctatcccct ccccacccct caggtataac tactgttctc attcttttat atcaaaggtt 180aaatttacct gttctaaact tcatatgagt gaaattatac aaaatgtaat gcttctttca 240cttagcataa tgtttttgag atttattcat gttgttgcat gtgtcagtga ttcatttctt 300tttattgctg agtattcttt cgtatgaata tatcacagtt tgttttttta tctttctgtt 360gatggacact gggctctttc tgcttgtttt ttactgttat gaataaagct gctatgaaca 420ttcttagaca aaaaaaaaca acaacaaaac aaaaaagtcg gggggaaacc ggggaaaaag 480gggacccggg ggggaattgg ttccccggcc aaattccccc caattttgcg acacaaaaga 540caac 544 94 5631 DNA Homo sapien 94 gcggccgagc ctcctgcggg gatgcggttttaacgcgctg gggcctccct ttatagggtc 60 acatgtgatg gggaggagga tacaagcccggaatctggga taggtggtaa tagggccacc 120 agctgctttt agttttacgc ttcttatgtcttacctggaa acaattattc taaaagttgg 180 tcacctcatt aagccaaaac atgagggcagagtggaggaa ggatgcaaaa agttttaaaa 240 taatatagga 
aagtactaaa tttaagaatgtctggacaac tcattccagg tcacctatag 300 cctatgagag aggaagaata tattttgacaattatcggcg ctgtgtcagc agtgttgcat 360 ctgagccaag aaaactttat gaaatgccaaaatgttccaa atcagaaaaa atagaggatg 420 ctttattatg ggaatgccca gtgggagatatacttcccaa ttcatcagat tataagtcct 480 cactcatagc actgactgct cataattggctacttcgtat atcagcaact acgggaaaaa 540 tccttgagaa aatatatctt gcaccttattgcaaattcag atacttgagc tgggacactc 600 ctcaagaagt cattgcagtt aagtcagctcagaacagagg ctcagcagtg gcccggcagg 660 caggcattca acaacatgtt ttgctgtaccttgcagtgtt ccgagttcta cctttttcac 720 ttgtagggat tctagagatc aacaaaaagatttttgggaa cgttacagat gctaccttgt 780 ctcatggaat actgattgtg atgtacagctcaggactggt cagactctat agcttccaaa 840 ccatcgctga acagacatgc caccactgctctttgaggtg tcatccctgg agaatgcttt 900 tcagattgga ggccatcctt ggcactacatcgtcacacct aataagaaga aacagaaagg 960 agttttccat atttgtgccc taaaagacaattccctggca aaaaatggga tccaagaaat 1020 ggattgttgt tctctagaat ctgactggatctatttccat cctgatgctt ctggtagaat 1080 aatacatgtt ggtccaaatc aagtcaaagttttgaagcta actgaaatag aaaataatag 1140 ttctcagcat cagatctctg aagattttgtcattttggcc aacagggaga accataaaaa 1200 tgaaaatgta ctcactgtta cagcttctggacgggtggta aaaaaaagtt ttaaccttct 1260 ggatgatgac ccagaacaag agactttcaaaattgtggac tatgaagatg agttagattt 1320 gctttctgtg gtagctgtta ctcaaatagatgctgaagga aaagctcacc tggatttcca 1380 ctgtaatgaa tatggaactt tacttaaaagcattccacta gtggagtcat gggatgtgac 1440 atatagccat gaagtctact ttgacagagacttggtgcta cacatagagc agaaacccaa 1500 cagagtcttc agctgctatg tttaccagatgatatgtgac actggggaag aagaagaaac 1560 cataaacaga agctgttaaa aagagtgagataattgtaac ctaagagact tttagccaaa 1620 caccccagca gctgcgtcca atccattttattatctgcat ggcacattct ccagtatttt 1680 ccaaaaaagt cttgtgttga cttcagatgactatgacttc ttttttaaac tcttgctgta 1740 aaagatggtg aggacttcat tttttttaaaggttttttag aatactgttc caagaagttt 1800 agtgttttgc agctttgagc taggtggtaatgcaaatata aaatgctggg aacagaaaag 1860 gacaggttaa ttccaattgt tgaggagttaagtcattgat ggggtgggtc attgatgagt 1920 tcttaaagga tggtatggaa ttttgtttgttaaggctagg aaagacaggg 
agagacaaaa 1980 gtaaacatgc agaaagaaat cttatatcctctataccaaa ctttgcttaa ggatgagaaa 2040 tgagatgtgt tatgtgagaa cattattttgagcccaaaat gtgtcatcac agtttttaaa 2100 aatcttatat atgtatttat atgtgtttcgtatttgtata tagtatcagg aattggttct 2160 agttcccaaa ttatcttttc ttccttggttttgttctctt ggcttgatgt tcacattgaa 2220 tatttgtgtt tctatatagg ctaatgtaaaagattccaag caaaccttaa gtgaaattgt 2280 tttctgattt gcatcctgtt tagctcttaatgtatctaag gatgttctca tctcaccatt 2340 ctactcattt agtgagtttt ctgatcttgtttaggcaata tttgcatact tatgcaataa 2400 gataaaggta cccttgcctg cagtagttctgtttcctgta gaaaagtgga taaagagtcc 2460 cagaagaagt tcttactagc ttgggagttacctgattaac cagagaaaat ttttggctta 2520 cttatggaac aagcattatt tcttctttgttaggaaagat ctaaatatgg tccttgactt 2580 ttaataatca ttctttagaa tgttaaataaaggcaaccca agtaaaggga gaaaatgttt 2640 ctttgtgctt cctgtttgag aaattcagttgcttccattt cgcatgttct gcacatttat 2700 ccgatgtaac ctcaaaagaa taactggtaataagggaagg aaacagcagc aacaatcatt 2760 gctgattcaa gtttaaggtt aaaatatggaatttttagct tggatgattt atattaaaat 2820 ctttccattt ttttttttca gttttggcttgatgccatgt taagaatgat gtgaattctt 2880 cccagttctg ccctggtgct agacattgccccatactttc aattagacac tagctgtatc 2940 taaatagtcc cactcagtaa acttacatcttgaaaaacaa gaccagtaag aggccagtga 3000 aagtactaaa gaaagaaacc aatgttgtgtgagtttcaaa gcagctgcaa tgctgtgtaa 3060 aagtagagtg ttcattctcc atttccaagagtgtttcaga ataggatgtc ttaagacttc 3120 agtcatgtca gagatttttt tttttaggtgattattgagt ttctccttct cctttaagtc 3180 atcaccttcc ttttatgaaa tgatagtaaggaactcgtct attctgaaag gcatttgaga 3240 aatagctgaa ttcctggctg cttttttgctgggggtagat ggtggaatac ttctggtcta 3300 gatataactt accactaaga aacccccagtatgtcaccac tgcctaaatc taactagacc 3360 agggtccaaa tgccatccag gccaggcaggaaatatacct catgtgaaag acagtaagga 3420 gttgtgggca gtgtaacaaa caggagagctatgccccaac taaaaggagc agctgctact 3480 gcttagtttc agccagttgc aacagtatgtgggaatgtag gctgcatggt tgttaacaag 3540 atagatggta aaaagatgcc agaagatacagaagatagca aagaatgtgg ggaatttgga 3600 taccacacat agcgagagac aatgaagcatgcttcccagc tcgccagagt gtcacacagc 3660 tgctcattct gccacctgcc 
agacattaatgtcttcctgc cctacctaaa ccccctcttt 3720 acctgatatt ttaattcgag actctagctacatgcccacc tacttaacag gtactagtga 3780 caggtacaaa acattatggg taacaattctgagtgtttaa tgcaagccca ggtgaagcag 3840 ggtagcttcc atcagcaggt acagacgttacgctgaaaag aggtgcattc tgcattgcac 3900 tcctggatct aagtttctgt attctcagagcatcaatgca gcaagcttat tgttcctcaa 3960 ttttttacaa tatttatcac aactctgggagaaaacaaaa caaaacctat cctatttact 4020 atttgtgcta cctagtgagg agataccgctctgtttagac aaattaaggc acttcacatt 4080 cttccaccaa ttgaaagttt tgtatcttacagttcttttt ttaaataata tatttattga 4140 gcactttcta tctactagtc actgtgatacagtataagta aagtgggttg tctcatttaa 4200 tattcagaat aaccacatga agtatgaactgccattatct ttcccctttg tacaaatgag 4260 gaaagtgagg ctcacagaag ttaattggcccagggtccca caactagtca gtgcagaggt 4320 gggaaacata accagatttg ttcggcatgaacttgtgcca aatttcctcc aaagctctca 4380 aaaggcaagg catgttattt tatcccaatttagcatacca acaactataa tactagatat 4440 gtaggaaagt gcttaataat cgttttttactgatgattca gtgtctaaat tttgaacaaa 4500 tttgggtaag atacaagtca cacataaattgacagaaaat gtagttcttc attcaatggt 4560 tagcagtcat taaaaggtac tttccctttgtttgtggtga taatcagtat tagtagtttt 4620 catattattt ggcttccata ttaatcatttttatattttc ttctccttct taccatgttt 4680 acttatatca tccatctttt agaatcccagggagctaatt tctggtccct gtgttgctat 4740 caaatctgta tcttgcagaa agaataatttatttcaaaca agggacatac aatagaaaga 4800 taagacctac tgaggtcttt ttcccatcattttattatga aaaatgttca aacatacagt 4860 aaaattgaaa gaattttata gtaaatactgaccacgggga ttctacatct tactctactt 4920 gttttattat tttcctatcc agcgtactttttgatggatt tcaaaataaa ttgcagttgc 4980 tgatatactt ccccctagta cttcaactgcagattattaa ctagagttta gtatttattt 5040 agtttttaaa tttttttgat ttaagatttacctgcaataa aatgtacaaa tcttaagtat 5100 aaatttactg agttcttgca gacatatacacctgtgtaac ccaaaccctt tccaaaattt 5160 agaccattgc aatcatcttc agaaagtttcctaaatccct ctcccagtct atcccctccc 5220 cacccctcag gtataactac tgttctcattcttttatatc aaaggttaaa tttacctgtt 5280 ctaaacttca tatgagtgaa attatacaaaatgtaatgct tctttcactt agcataatgt 5340 ttttgagatt tattcatgtt gttgcatgtgtcagtgattc atttcttttt 
attgctgagt 5400 attctttcgt atgaatatat cacagtttgtttttttatct ttctgttgat ggacactggg 5460 ctctttctgc ttgtttttta ctgttatgaataaagctgct atgaacattc ttagacaaaa 5520 aaaaaaaaaa aaaaaacaaa aaagtcggggggaaaccggg gaaaaagggg acccgggggg 5580 gaattggttc cccggccaaa ttccccccaattttgcgaca caaaagacaa c 5631 95 96 DNA Homo sapien 95 aggtaggaggggttagagga gggaggactg acgtgaggaa gaggaggacc attcggacaa 60 tgtattaggacactctcacc aagctggggt atcatg 96 96 495 DNA Homo sapien 96 gcgtggtcgcggccgaggtt ggaaaaaaaa tttttttttt tttttttttt ttttggaaac 60 aaagtctaattctgtcgccc aggttggagt gcggggtgtg atcttggtca ttgaacctcc 120 gcctttgtgggttaaaaaaa ttctcctgcc tcagcctcct gagtagctgg gattataggc 180 acacaccaccaaacccagtc taatcttttg gtattttttt tatagtagag acggggtttc 240 gccatgtcggccaggcgttg tcttgaactc cgtgacctcg gggggctcca cccgcctcgg 300 ccccccaaaggtgcgggcat tacgggcgtg aaccaccggt gccgggccgg aagtctttaa 360 aaaaataagggtgatactac atcttcaaag actgggggat aactcagggc ccatagctgg 420 ttcccgggtgtgaacatttt tactccgcct cacaatcccc cacaaatact cgaaaaatgc 480 ggaaaaaaaaaaaga 495 97 1374 DNA Homo sapien 97 cggccgggca ggtaccatgg caatgaagccggatgtctgc ccttcctgca accagctccc 60 tcaggagtgt tagcttcaag ggtgactggccggatgagca tgctggagag catcagtgga 120 gaggcgatgt gggaacccgg tctgcctgacctttctattt ctgaatccag acccatcgtg 180 caccatggct acgtgtacca atctggtaaaacttctcatt gtcactccca gtgaaacgcc 240 cagtagtatc acgaacacat ttccgactcccactgtagtc atagaaatcc gtggcatcca 300 actatagaag tgggtgtcta caacgtgatacatcgtaggt ttaattaaat agaaataaaa 360 taaaacaaga caaagacaga taacaggcatgagggagaga aagccgtgcg ggagccaaac 420 aagacgagat taaggcccga ggggagagcaaacagtgagg gtcacacgcc cgggcgccaa 480 caagagtacc acagagcaaa agagaagataacaaaataca aagggggaac aaatgacgaa 540 aagagaaaaa cacaatgtag gctagaaaaaacagaggaaa aaatagcaag aaatacaaac 600 aaaaaattat cccaaagcaa acaaaagcaaatagaagaga aggcaaagaa gaggaaagaa 660 aaaagacacc aagaaggaaa gaaaagaaaggaaaaaaaga aagaaagaca caaggacaaa 720 aaacagaaaa aaaaaaaaag aagaaaaaaaaacagcaagg agaaaccaga acaggaagcg 780 aagacagaag acagacgggc gaggcaaagcaagaacagaa cacagaaaag 
acccacacga 840 gaaacagaga aagcgaaacg tacagacagaaggaacgaac aggaacgaga cagagacgca 900 aaagcagaag aaagacacac aagaacaggaacaaaacaga ccgaaagaca aggagccacg 960 aacggagaaa gaaaagacga ggaagagagcgagcggagag gaagaagagc aggagagacg 1020 agagaagaaa ccgacgagag agagagggcggagcggcgga gaacaaaaga agacgcagcg 1080 agccgcacac gacgaaccac gcacacggagaaacgcaacg cagcaacagc gcgcgacgac 1140 aaacggaaga cgaaagagaa aaacaagcacacaaagcaga caacgcaaga acgacccata 1200 cgaccgacga cagaaccaag cgcagagacaacagcaaaag aaggagacag gaaacgcgag 1260 caacagcaac acggaacgac gacgacaaaaacgaacacac ggaggcgcaa gcgcagaacg 1320 aaagaaacaa cgaacgagaa cacaacagaaacacacgaag cacacacaca acgg 1374 98 1713 DNA Homo sapien 98 ggaacaaatgtcatgccagt gggagttcaa gtgccagcat ggagaagagg agtgcaaatt 60 caacaaggtggaggcctgcg tgcttggatg aacttgacat ggagctagcc ttcctgacca 120 ttgtctgcatggaagagttt gaggacatgg agagaagtct gccactatgc ctgcagctct 180 acgccccagggctgtcgcca gacactatca tggagtgtgc aatgggggac cgcggcatgc 240 agctcatgcacgccaacgcc cagcggacag atgctctcca gccaccacac gagtatgtgc 300 cctgggtcaccgtcaatggg aaacccttgg aagatcagac ccagctcctt acccttgtct 360 gccagttgtaccagggcaag aagccggatg tctgcccttc ctcaaccagc tccctcagga 420 gtgtttgcttcaagtgatgg ccggtgagct gcggagagct catggaaggc gagtgggaac 480 ccggctgcctgccttttttt ctgatccaga cccatcgtgc accatggcta cgtgtaccaa 540 tctggtaaaacttctcattg tcactcccag tgaaacgccc agtagtatca cgaacacatt 600 tccgactcccactgtagtca tagaaatccg tggcatccaa ctatagaagt gggtgtctac 660 aacgtgatacatcgtaggtt taattaaata gaaataaaat aaaacaagac aaagacagat 720 aacaggcatgagggagagaa agccgtgcgg gagccaaaca agacgagatt aaggcccgag 780 gggagagcaaacagtgaggg tcacacgccc gggcgccaac aagagtacca cagagcaaaa 840 gagaagataacaaaatacaa agggggaaca aatgacgaaa agagaaaaac acaatgtagg 900 ctagaaaaaacagaggaaaa aatagcaaga aatacaaaca aaaaattatc ccaaagcaaa 960 caaaagcaaatagaagagaa ggcaaagaag aggaaagaaa aaagacacca agaaggaaag 1020 aaaagaaaggaaaaaaagaa agaaagacac aaggacaaaa aacagaaaaa aaaaaaaaga 1080 agaaaaaaaaacagcaagga gaaaccagaa caggaagcga agacagaaga cagacgggcg 1140 aggcaaagcaagaacagaac 
acagaaaaga cccacacgag aaacagagaa agcgaaacgt 1200 acagacagaaggaacgaaca ggaacgagac agagacgcaa aagcagaaga aagacacaca 1260 agaacaggaacaaaacagac cgaaagacaa ggagccacga acggagaaag aaaagacgag 1320 gaagagagcgagcggagagg aagaagagca ggagagacga gagaagaaac cgacgagaga 1380 gagagggcggagcggcggag aacaaaagaa gacgcagcga gccgcacacg acgaaccacg 1440 cacacggagaaacgcaacgc agcaacagcg cgcgacgaca aacggaagac gaaagagaaa 1500 aacaagcacacaaagcagac aacgcaagaa cgacccatac gaccgacgac agaaccaagc 1560 gcagagacaacagcaaaaga aggagacagg aaacgcgagc aacagcaaca cggaacgacg 1620 acgacaaaaacgaacacacg gaggcgcaag cgcagaacga aagaaacaac gaacgagaac 1680 acaacagaaacacacgaagc acacacacaa cgg 1713 99 1448 DNA Homo sapien 99 tggtcgcggccgagcgtact tttttcccaa acatgtgtgt atatttcgta agcagttaga 60 acatattactagacttgact ctgacgaact tcaccctctg aaaattcctt gacaccactt 120 cctaactttacatacgtgct catggcttac acataaacat ctactaaaga cggcacttct 180 ctatcctctatactgcaacg cctaacctcc agattccgac tctagcgcta acctaacgtc 240 tcaataccttgctccatacc ttgctcctct tgcttcctca ctttcctcta attctcttca 300 tattctcttaacacaacctc aagagtacta ttctcttaac ggcacacgaa cgctaacgcg 360 cacagcatctgccttgccac gaaaatgcct tcagacagaa tgcatctctt catcttaaaa 420 atggcttcccttaggcaccc cacgggacaa ccttgcaagc tcaaatctca gggcgctcac 480 tgcacacaactctcccacgc tctcactacg gcttctctac aactccttac tctgggctac 540 aactcttcaaacattaacgg cttttctctt caacattgca ccttacaaaa cattgaacaa 600 ggcttctctctctagaaccc acaacaacac actacacaca cacacatacc acaccacacc 660 acacactagacaagacagcc gactactcgc tgcggagccg gaacaacact cctcatacag 720 acgcgcgcaccatacacgtc ggcgtgtgta tcaccacccc aaggcgcggt gtgcagcacc 780 acaccgtcgcggggatcatc acatcacggg gacatcacca acagaagaag attcccgccc 840 acagagaacaaaacagtcta ggtgcacaag tcaaaaagat gtagggtcgt tacacgtaag 900 catcgatagtgctcgcgacg tagaggttac cgagtgctgg gcacagcgga tggtgacagt 960 gagtgtataatgattaggat ggcgccacgg tcgacaagat tgtgttgatg gcgtgctcgc 1020 gtgttcgtgactatctatcg tttgatgtga tctgccagtt gactattatt agtactttcg 1080 cgatgagtgcgggcgtcgcg gatggacgtg cgccgtagcc gacgcggagc agctgagtgc 1140 
agaggcgcgtctgagccaag taaataatgc aaggggcaag gtcgggcgga aggcgcggtg 1200 cgcggtggggaagagtgagc agaggtgacg aggcggagag gggaccgacc tgtgatggga 1260 gggcgagcgggaggtaggag gaagcatgga ccagtagtag atgtgcgagg agaggtgtgg 1320 tgagcgagcagaaggaagcg cgacacaaaa gtgcgagagg acggagacga ggacagatat 1380 ggacgcctggagagaggagt caacgagagg gcacgctaag cggcgagagc ggtcgaggcg 1440 aacgaaac1448 100 1786 DNA Homo sapien 100 atttaataga ctatataggg atttgcctgcggcaagaatt cggcacgagg gatgccaaag 60 aagaccatga aagaacacat caaatggtcttactgagaaa gctttgtctg ccaatgttgt 120 gttttctgct tcatacgata ttgcacagtactggtcagta tcaggaatgc ctacagttag 180 cagatatggt atcctctgag ggccacaaactgtacctggt aagttctaga gccttgtagt 240 tttaaatttt aatgatttga tatgctctgtagtaatatta attttgtgaa gatgtgttta 300 catttgtaat tgctcttgga ttttcttaaagtaatagtcc agttttaatg ttttaatgtt 360 tgtacttttt tcccaaaatg tgtgtatatttgtaagcagt tagaaatata tagactgact 420 ctgagaattc accctctgaa aattccttgacaccacttcc taactttaca tacgtgctca 480 tggcttacac ataaacatct actaaagacggcacttctct atcctctata ctgcaacgcc 540 taacctccag attccgactc tagcgctaacctaacgtctc aataccttgc tccatacctt 600 gctcctcttg cttcctcact ttcctctaattctcttcata ttctcttaac acaacctcaa 660 gagtactatt ctcttaacgg cacacgaacgctaacgcgca cagcatctgc cttgccacga 720 aaatgccttc agacagaatg catctcttcatcttaaaaat ggcttccctt aggcacccca 780 cgggacaacc ttgcaagctc aaatctcagggcgctcactg cacacaactc tcccacgctc 840 tcactacggc ttctctacaa ctccttactctgggctacaa ctcttcaaac attaacggct 900 tttctcttca acattgcacc ttacaaaacattgaacaagg cttctctctc tagaacccac 960 aacaacacac tacacacaca cacataccacaccacaccac acactagaca agacagccga 1020 ctactcgctg cggagccgga acaacactcctcatacagac gcgcgcacca tacacgtcgg 1080 cgtgtgtatc accaccccaa ggcgcggtgtgcagcaccac accgtcgcgg ggatcatcac 1140 atcacgggga catcaccaac agaagaagattcccgcccac agagaacaaa acagtctagg 1200 tgcacaagtc aaaaagatgt agggtcgttacacgtaagca tcgatagtgc tcgcgacgta 1260 gaggttaccg agtgctgggc acagcggatggtgacagtga gtgtataatg attaggatgg 1320 cgccacggtc gacaagattg tgttgatggcgtgctcgcgt gttcgtgact atctatcgtt 1380 tgatgtgatc 
tgccagttga ctattattagtactttcgcg atgagtgcgg gcgtcgcgga 1440 tggacgtgcg ccgtagccga cgcggagcagctgagtgcag aggcgcgtct gagccaagta 1500 aataatgcaa ggggcaaggt cgggcggaaggcgcggtgcg cggtggggaa gagtgagcag 1560 aggtgacgag gcggagaggg gaccgacctgtgatgggagg gcgagcggga ggtaggagga 1620 agcatggacc agtagtagat gtgcgaggagaggtgtggtg agcgagcaga aggaagcgcg 1680 acacaaaagt gcgagaggac ggagacgaggacagatatgg acgcctggag agaggagtca 1740 acgagagggc acgctaagcg gcgagagcggtcgaggcgaa cgaaac 1786 101 467 DNA Homo sapien 101 taaacctgag caaggcgtgagtggggcaaa catctgggga tggatagata gagatggcta 60 aactcaaatc aattagctgagagatcactg cccgatgtac agtctgggag cctcgtgtgc 120 ctgatgaaat atgatcggcaacgcgcagag aaggaaaggt gcgtataggg agctctaacg 180 aaatatccaa tcaagaatccacgagagaac tacaacatag aagaataaaa gaaaggaaaa 240 aagagataga agagaaaaaaaagaaaaaac aaacatataa aaataaaaat cgagcagaga 300 aaaaacagag aaaaccaacaaaaagatcaa gaaagagaac aagacaagaa aaaagaacaa 360 ttgtagaaac aaaaaggcaaaagaaaaaag accaagagaa aaagaaaaaa aaggaccagg 420 agtattagaa agaaagataaaaacaacaaa ggacaccaga cacaact 467 102 103 DNA Homo sapien 102 tcccggactcattcgttgag tcaaccggta gacacgagtt atacgcactg gaggagcatt 60 gagaataatggacaagtgta ccgggtagag tatcgtccat caa 103 103 724 DNA Homo sapien 103gagcggccgc cgggcaggta gagacaggtc tctctctctt gcctagctgg gagtgcagtg 60gagtgatcat agctcactga ggcttgatac tcctgggctc gagcaatcca cctcagcctc 120cagaagtagg ggagactaca tgatgtgtgc caccatactc agctaatttt taaactttcg 180tatagacagg gtctccctgt gtagcccagg ctggcctcga actcctgacc tcaaaaaatc 240ttcctgcctt ggcctcccaa agcactggga ttataggtgt gagccatggc gcctggtcat 300aaattctatg ttatttgttg tttgttcgtt tattagagat ggaatctctc tctcttgacc 360aggctagagg gctgtggtgc gatctcagcc cagctgcaac ctctatctcc tgagctcaag 420cgatcctcct tagcttccca aatagctgga actacaggca tgtgccatca cgtccagcta 480attttgtatt tttagtagag aaggttttac catgttggac agggtggtct cgaactcctg 540gcctaaagtg gtccacctag ctcagcctac caaagtgctg tgattacagg cgtgagccac 600catgcccagc ctctaaattc tgttttctat tcaaagtaaa aatgacatgt gtttgagtca 660aaaaaaaaaa aaaaaaaggt gggggtcggg 
aaagggcccg gggaattgtt cccggccatt 720caat 724 104 734 DNA Homo sapien 104 gagcggccgc ccgggcaggt cccccccctttttttttttt tttttttttt ttttttggtt 60 taaaaaaatt tgcctttgtt tttttctcttttggcggggg ggccctgctt gagggggtat 120 ggggccccgg ggaatggggg ggctggggaaataatccaca aaggtgttga aagagaaggg 180 gggtgtgttt tagaagcgcc aggcgccagggtgtcctcgt gtgccccgtt cttggggagg 240 ggacgcgcct ttgaggaagg gatttcttgggctaacgcct atatgtctgg cgcccccatt 300 ccatgtatta aaaatttctc tgagaaaatttcttatgctg gccctattcg ttgttggtgg 360 ttgcccatgt tccttccaat acatgcgcggtcaaggggac ccccgagggc ccttctgcgg 420 tcccttgtgg aagaaggggc gaagatatgtctcttgttta atcactagta taaagccggt 480 ggcgtgcata tcactcaagt gtgccatatatgccgggtct tctgggggtg tgatatatgt 540 gtgggcacct ccccgcgctc caaacacacccctttactaa caattccgtc gcgtgcacca 600 acagggcgtt tttatcggag ggcagacggagataagcgga ggataaaagg agatcaaaac 660 aagaaaggaa ggaaccgcaa aaaaacaaaaaaacaaacaa caaaaaaaaa aaaataagaa 720 gagcgagatg gagc 734 105 648 DNAHomo sapien 105 aggatccgcc cgggcaggtc cgggcaggta cctactaggc agttgggttcagggaaatag 60 ggattagact atggcctatc aggctcctat atggtcataa tatttaaaatatagggagta 120 gaaaacaaca aagaataggg ataggactac ttaaaaacaa tagaaagagcatatatatac 180 gtatatagta cccgtatgaa tagtagaata tatagtatat tatagatatatcataaatat 240 actagctagg taacaatagt agacgagtta aacaataggt agcatataatagtaatataa 300 taatataata aatattacag aaataacgca ttattataaa tatattactaatataccagg 360 gtagacataa atagcattta aatattaggg atattagggt aggagtaattaatagtatta 420 actaggagta tagtacaacg taaaatgaag gtcccccatc agcggaaaaaaaacaaaaac 480 acaaaaaaaa gaaaaaaaaa aaaaaaggtt ttgtgggggg gttatactacgtgtgggcat 540 aatataggtg tgttaccggg tgtgtgttgt gctagaacaa catggtgttgtgtaataatt 600 acggggggct tctcagacaa attttttcgc gacaaaaaaa atttatag 648106 580 DNA Homo sapien 106 ttagactcat atagcgaatg tgtccattaa tcatgctcgagcggcgcgtg tgtgatggat 60 gtcgcggcga ggtactagaa gtgaagcacc tttatttagcaataattaca aagagttgct 120 taagattgat gcagataaat cattcatgaa actagaacaagattatgaac tacattagta 180 agttccttca ttcagcaatt tatgccaaag atacactttccctgacttca cttttccctg 240 
ccttgagata aaatgaggat aacagtggct atttcttagggttgctataa agattaaatg 300 agctgatact tgtaaagtat gtaaaagaag gcctgacatattatcagttt ccattgacat 360 ttctactttc aaggaacttg taatatagtt agggaggtaacatatgcaca taaacatcta 420 aataaagatt ctcagtaaat gcccaagtaa gcaattctgtaatgtataaa aaaaaaaaaa 480 aaaaaaaaaa aaaaaaaatt tttgtttttc tctttggtctctttcttttt ccctttttaa 540 atttttttcc ccctcccaat ttcccccaaa aatttgacaa580 107 1634 DNA Homo sapien misc_feature (369)..(369) a, c, g or t 107tcgcggccgg aggtaccctg tggcccagga ggacgtccgt acttccagcc ccgacaccct 60ggacatctac tctgcccgtg tgatgactgc ctcccattga tcatggaccc tgctgtcgag 120gtcgctggca atgaagttgc ttgaatacaa aatgactcca aggtacgtta atacgttgct 180tacccacttc taattgagac agaataggta atggcccaac ggtccccaat gaagctgatc 240gcccgcgcat gacgcatgtg ttgcaggatg taggggtggg ggcgatagtg tatcgggggt 300aagtctgtcg tgatcatgct cctccctgaa ggcatatccc ttcgcgtgca gaatctccgt 360gtcgctccnt tgtgacgtgc ctctctgaca ggcgtcctct catttcacat cgtccctatg 420gacagtaccc acccaactac cccgtcataa ccttgtttcc tcgctccatg caagtgaatc 480ccaaccgaac cctattgcca tacacgcttg gagcacgcta ttataggggc ttgttgaatg 540acgataccta ggaaggtaaa agacgttctt ctataatatc tatccacttg cgtgcatatg 600ggggcaaaga gagagagcct acttatttct acacccttga ccttgggtct attgccaaat 660ttaccaaact caaagaaacg tatacccaac gaatattatg taattggggg ttaccaaata 720catatacacc aaggaccata tattatataa acccaaagaa aaaactgagc ccggaagtgc 780aacgtcctta tcttacccgt gtgccaagtc ctattcccag gtcgtaatat caccccgttt 840ttgtgtgtgt agcccaaact gtggcttagt actacccccc ggttgtccac gcgacaaaat 900tcccccacac attaaacaaa gagaagtgtc gctcctatat tattacacac acacccaggg 960ggcccggccc ttatataaat tttttggggg cgccccttgg ggaaatttgg ggttctccac 1020acggggaact taaaatttcc aaccccttta gaaaacgcca acggtttgga cacaggcccc 1080caaggcgtag actttacccc caacttttcg ctgtgtacgg gctggtctcc aactttatct 1140tttttggcct agggttcaca cccccaagga ccaaaaccgg gtttgagccc caacctgaaa 1200cggaaaaacc atttcccttt tggaacacag ggaaccaaca atttccttta gaaaccaatt 1260ttggaaaaag gcccccaaac actgtggttt cccccccagt gggcaaacag cacgcctttt 1320cccacttcca 
aaaaaggcct ttggagagcc ggttaaaact ccaatagggg ttcccaacta 1380aaggcttggc tttacccctt ggttgttttg acacactttg gtgtaatccg ccggctccca 1440caaattccca cacactcacc tcagatgaca tgcgagagca cagctgctcg cgcaaggagc 1500gagaaggtca actacacagc cgtaacactc atcagacgcg ccaggcgaca cacgagcaac 1560aacactgcgc gaacgagcac cacaatcagc gacgacgaga ctgaacgcag cgaccaacac 1620gctacgcagc agag 1634 108 697 DNA Homo sapien 108 ctgatgcggc ctgcccgggcaggtcccccc cccttttttt tttttttttt tttttttttt 60 tttgtttaaa aaaattgaactttcgttttt ttccttttgg gcggtggtgg gcgcccgatt 120 aggggtggtg ggtgtccccagggaaggggt gtggcgctgg gagaacatat gatctagcca 180 gaaaagttgc ttgagaatgagaacgtggtg gtgtcgtgtt ttagaagtgc gccatgtggc 240 caagggtggc gctcccctcaggctccgctt tctttggaga agtgtgagcc ccgcctgtag 300 agagaaggag atctcattagcgcaaacgca caatacgcgt atgcgcacac acaatctcaa 360 agattataaa agaaaatctcttcgtagaaa caatctccta cgcgcggccg cccactctca 420 cgtcttgtgg cgtgtgtctcccacgtattc tcatcatcat accatgtgcg cggtcacaag 480 gtgtacaccc cgaggggttctcccttctcg tgggtcttcc cgtgtgtgtg tgaagagggc 540 cactcataga ttccggtgtatactctacag tgaagacctg tgggtgttta tacactcggt 600 ggtctcaata ccctttgtccccgtgggtgt gaaaatttgg ttacccgccc tcacaattct 660 ccccacaact tgcggaacaaaagatacacc gctgttt 697 109 581 DNA Homo sapien misc_feature (487)..(487)a, c, g or t 109 gcgagtgtgg cctctaatgc atgctcgagc ggcgcagtgt gatggattcgcggcgaggtc 60 cccccctttt tttttttttt tttttttttt ttttttaggg agggggaaaaattttttttt 120 tttttttatt ttctcccaaa aacccttttt ttggaaaaaa ttaaaagttgcaatgagggg 180 ttttctttat aaaaaaaata ttaaaactag gggcatccta ttattccaaaaaaagtttaa 240 tttgctattt gttgacaaag cacatcacga gtgggtgtat aagctggtcctctcttatat 300 ttttcagaga aaatattatt ctcacagtgt ccatgtctat tccatcaccgtgtgttcaag 360 gagaaaatct cgtcgggcgt gtaactcact tggggtgcat aagtgtgtgttacctctgtg 420 tgaaatattg tgttttatcc ccgttccaca atattcccac aacaacatctaaccagaaca 480 cacacgngtc tgcagcaaga ggcgggcgcg cagaggacaa gagacgggacaacgagcaag 540 agaccaagca gcggaggcaa gagaaggaga gagggaagta c 581 110 862DNA Homo sapien 110 ccgcccgggc aggtccctcc tccttttttt 
tttttttttttttttttttt tttttttttt 60 tttttttttt tttttttttt tttttgtggg ggttttttttgtgggaaaat tggagggggt 120 ttaaaaaaat tccccccccc ccccttttaa aaccccaatagtgggggccc cctggggggg 180 gaaaaacagc ggtgggtggg tcgctgaaaa tgtagtctactaagtagata aaacagctgt 240 gttcttgtgt ggtggcccca cccgttgttc cacatcttctattaatagat agtgtggtgg 300 ggtgccgcag ggaggcgcaa acaacatata ttttctttatttcaaattca tttgtggggg 360 ggaaaaaaac tttattgttc accacacatg cgtgggtagatcacaacagc aaaagaagat 420 gtgtcaaaat aaatgggtgt gctaaagaag ccgggtggcgtggaagacaa acactattag 480 tgatcatgtt gtgtcggagt gtgtgtgatt atcctcccgcgcgggtaaga agagaggtgg 540 tggtctgcaa cacaaagagg ggcggcggga ggaggagagaacaaccatct atcacccgcg 600 ttgcggcgct tatttacata tatatggtcg agggcgagatcaaacatatc tcgagggaga 660 gagagggcga gcgggcgaac ccaaccacgg cgggacaacaagaggccatc tcccgcgggg 720 aggaagaaag gggttgctcc gcgcggcgcg ccccaacccctccacacaac accctatacc 780 gcacacaaca aaccaaacca caacctcgga cacagtcaacaagagaatat aaaaaaatat 840 aacaaaaaaa acaacaaaag tg 862 111 298 DNA Homosapien 111 cacaacatac gagcattgct gaagaaaaaa aatcatcaga agctcttcaggaggtgtgtc 60 gatatggaca gacacagaat cccagaatcc caccacaata agatggagcaattccaaaat 120 aagggctaat ggagcccgaa aggtatcatc gccagcatgg caacaagaatgagccaacag 180 gccgacaaag attgaacgga taatcatagc caccatacga agttctcatgactgtacggg 240 aatacataga caaatcaaac atacgctaca actgtccaag ggaaacatcattccagga 298 112 638 DNA Homo sapien 112 tgatgatcga tatataggccaatggttcat ctaatcatgc tcgagcggcg cagtgtgatg 60 gatcgtggtc gcggccgaggtccccccctt tttttttttt tttttttttt tttttttttt 120 tttttttaaa atttgttaattttttttttg tctcttcccc ccaaaacccc ctttttttaa 180 atttttacgt ttttcggggtttttttaaaa aaaaattaaa acagctttaa acaccttttt 240 ttaaaacgat tatttcaaagaactattttg tcgaaagaaa tactccggga tcccgatatt 300 atcgagcgct ccatctttatatttatcaaa ataatattct agccagggac cgtgcgtttc 360 taagccgatc cggggtgtgcgtcggagcat aaactatggg gaggcaatag caggcagtga 420 ggcagctcgg aggtctgtatacatcggctg ttgtggagta acatgtgtca ttgtccgcgc 480 gtcccaccaa ttccgcaagcaacaaacttt gtaacgagag agcagataca ggagatcagc 540 tcgcaagcga 
aagtccagtcagcggctaac cacggcagac acgccgagcc aagacgtcgt 600 tgtgcatcgc tagtgcctggtagcgacggg gcggcgct 638 113 783 DNA Homo sapien misc_feature(304)..(304) a, c, g or t 113 gatgatacga ctactatagg cgaattggtcatctagatca tgctcgaagc ggcgcagtgt 60 gatggatcgg gagcgggctg ccgggcaggtccccccccct tttttttttt tttttttttt 120 ttttttccct gtaaagattt tttttttttcctctaaaaaa gtccactttt aaaatggggt 180 tcccggaaaa tttaccaggt ggctctttttaaaaggggca aaagggttgc attccaattc 240 cgggggtttg tttcccccat ccccaatttttgggggctgt ggcaaaaacg gcggctctta 300 gggnaaagag gaggggttgt ttaaagggagacagaggagt ggtataaaac acccgtttgc 360 ttgttgttag gaactcatca atataccatatttctcctac agntgagtgt ggcttcatta 420 ttggggtctg ttcggccgtg gcaccccccttcagatagtg tgtgatttga gagagacaag 480 caaggagggg ccgacgtgtc tccattatctctacagacac caccggtggg ggtgcgggtg 540 cgcgtcgcct tgccaaaggg agaaaagggggggcttatgg cgcgcacacc tttctcaaca 600 agaaacaccg agccccccca antgattggggccagtaatg atgaggccct gggtgggata 660 ctcatggtgg cacataaggc gtcgtctccgggtgttgacc agtgtgttac tcccgctcac 720 aatcccccca aaacatggca ccaacaaaaacatgagagga cgacagcaca cagaacgaag 780 aag 783 114 648 DNA Homo sapien 114acaagcttat aggactctat gacaggctgt gaatgttttt tttgttgttg ttgttgtttt 60taattgctgt taatattttt taaataataa agaaacaaaa ctagaaaaaa aaaaaaaaaa 120aaaaaaaaaa aaaggtgtgg gacttggggg atgtggtgga agggaatata cggtgcccca 180ttatctttta aaccgtgtgt tccccttttt aaataccggg gattattttt ttccaaggga 240cagttttttt aaagaaaact ttggagagtg ggggaggaac cacatggggc aaaacggcgt 300gtccccgggt gggaaatgtg ggtgcaccgg gctcaaaatt cccaccaaac aattcgagac 360aacgaaaaac gaacagcaac aggagaaaga agaacaaaca cgacacacac gaaacagaag 420gagaagagaa agagagagaa acaccaacac acagcaacca agaaaagacg aaaaagaaag 480ggaaaaaaga gaaagaaaag aagaaaaaag agaaaacaag aagaaagaac accagaaaga 540aaagaaaaac acaaagacaa gacaacacac aaaacaaaga aaaacagggc gaacaacaaa 600agaagacaaa aacagcaacg aaaaacagga gagaactaaa acaaagag 648 115 928 DNAHomo sapien 115 aagaattcgg cacgaggttc cctccctagg tcctcaaggt cctccaggctatggcaagat 60 gggtgcaaca ggaccaatgg gccagcaagg catccctggc 
atccctgggcccccgggtcc 120 catgggccag ccaggcaagg ctggccactg taatccctct gactgctttggggccatgcc 180 gatggagcag cagtacccac ccatgaaaac catgaagggg ccttttggctgaaattcccc 240 acctgccttt ggatgaaaga ctccgttggg aataaatggc caaagcttataggactctgt 300 gacaggttgt gaatgttttt tttgttgttg ttgttgtttt taattgctgttaatattttt 360 taaataataa agaaacaaaa ctaaaaaaaa aaaaaaaaaa aaaaaaaaaaaaaggtgtgg 420 gacttggggg atgtggtgga agggaatata cggtgcccca ttatcttttaaaccgtgtgt 480 tccccttttt aaataccggg gattattttt ttccaaggga cagtttttttaaagaaaact 540 ttggagagtg ggggaggaac cacatggggc aaaacggcgt gtccccgggtgggaaatgtg 600 ggtgcaccgg gctcaaaatt cccaccaaac aattcgagac aacgaaaaacgaacagcaac 660 aggagaaaga agaacaaaca cgacacacac gaaacagaag gagaagagaaagagagagaa 720 acaccaacac acagcaacca agaaaagacg aaaaagaaag ggaaaaaagagaaagaaaag 780 aagaaaaaag agaaaacaag aagaaagaac accagaaaga aaagaaaaacacaaagacaa 840 gacaacacac aaaacaaaga aaaacagggc gaacaacaaa agaagacaaaaacagcaacg 900 aaaaacagga gagaactaaa acaaagag 928 116 82 PRT Homo sapien116 Met Met Arg Glu Ser Phe Phe Val Leu Ala Val Leu Ile Ile Leu Gly 1 510 15 Gly Ala Thr His Pro Pro Pro Pro Pro Pro Cys Ser Thr Pro Ala Val 2025 30 Val Phe Pro Pro Ser Leu Val Gln Pro Val Phe Ile Met Thr Cys Cys 3540 45 Tyr His Val Val Leu Leu Phe Val Ala Pro Leu Cys Gly Gly Pro Pro 5055 60 Pro Leu Glu Arg Ala Ser Pro Val Pro Phe Val Gly Arg Gln Gln Gln 6570 75 80 Ser Ala 117 35 PRT Homo sapien 117 Met Val Phe Phe Phe Phe PhePhe Phe Lys Lys Trp Ser Leu Cys Asn 1 5 10 15 Phe Ala Lys Val Asp PheGlu Phe Arg Gly Pro Ile Asp Pro Pro Thr 20 25 30 Ser Ala Ser 35 118 107PRT Homo sapien 118 Met Tyr Leu Gly Pro Leu Arg Asn Leu Leu Asp Val SerLys Lys Lys 1 5 10 15 Lys Lys His Pro Gln Lys Glu Gln Pro Arg Gly AlaLeu Glu Cys Gly 20 25 30 Ser Pro Leu Ser Val Val Leu Cys Phe Ser Pro IleSer Phe Leu Glu 35 40 45 Ala Arg Glu Gly His Pro Ser Val Gly Ser Ser ThrIle Leu Leu Glu 50 55 60 Ala Ser His Ser Pro Ala Phe Leu Leu Leu Pro LysPro Val Phe Leu 65 70 75 80 Leu His Leu Gly Glu Gly Gly Lys Cys Val ProGly Leu Glu Asn Trp 85 
90 95 Cys Leu Thr Gly Lys Val Ser Gly Pro Pro Arg100 105 119 75 PRT Homo sapien 119 Met Ala Thr Pro Val Phe Gln Leu LeuArg Pro Arg Thr Leu Gly Tyr 1 5 10 15 Leu Arg Thr Leu Leu Leu Ser PhePro Met Ser Gly Glu Ser Leu Ser 20 25 30 Phe Val Asp Cys Ala Thr Lys MetTyr Leu Glu Ser Asp His Ile Ser 35 40 45 Gly Thr Ser Ala Ala Thr Arg IleHis His Asn Leu Ala Ala Ala Glu 50 55 60 Gln His Leu Gly Asp Thr Ser ProHis Arg His 65 70 75 120 195 PRT Homo sapien 120 Met Ala Pro Gly Tyr ProPro Ser Phe Leu Lys Lys Lys Trp Leu Leu 1 5 10 15 Glu Asn Lys Arg ArgHis Pro Arg Lys Leu Gly Glu Glu Thr Thr Phe 20 25 30 Cys Pro Ser Pro ProTyr Gly Gly Leu Arg Glu Pro Thr Gly His Arg 35 40 45 Gln Pro Leu Phe SerLeu Asp Arg Ala His Glu Lys Val Pro Pro Arg 50 55 60 Arg Tyr Ile Val LeuVal Gly Thr Gln Ala Ser Gly Pro Val Val Arg 65 70 75 80 Val Arg Asp AsnThr Leu Gly Lys Lys Asn Lys Ser Asn Asn Leu Val 85 90 95 Leu Leu Leu AlaTyr Arg Thr Arg Lys Arg Asn Thr Arg Ser Arg Leu 100 105 110 Arg Leu SerGln His Met Arg Glu Lys Ala Leu Gln Thr Trp Leu Glu 115 120 125 Ser TrpThr Phe Val Lys Gly Glu Lys Ile Val Pro Ala Pro His Val 130 135 140 LeuLeu Thr Ala Leu Arg Ser Thr Gly Asn Pro Gln Arg Lys Gly Gly 145 150 155160 Gly Glu Ser Trp Val Leu Gly Trp Glu Gln Leu Cys Gly Thr Pro Pro 165170 175 Glu Leu Arg Val Trp Val Lys Gly Ser His Asn Ser Phe Phe Lys Lys180 185 190 Asn Lys Phe 195 121 36 PRT Homo sapien 121 Met Ser Cys PhePhe Phe Ala Phe Leu Lys Met Glu Val Thr Ala Lys 1 5 10 15 Trp Glu IleAsn Leu Pro Ile Asn Ser Cys Asn Met Thr Thr Ala Glu 20 25 30 Gln Cys LeuGlu 35 122 117 PRT Homo sapien 122 Met Leu Arg Gly Ala Arg Glu Thr HisIle Ser Thr His His Ala Trp 1 5 10 15 Asn Thr Ala Leu Leu Glu Thr ThrArg Asp Val Tyr Pro Pro Gln Leu 20 25 30 Ser Cys Leu Gly Gly Glu Arg LysIle Trp Leu Val Arg Gln Gly Gly 35 40 45 Phe Val Pro His Leu Arg Gly GlyGly Glu Asn Ile Pro Arg Leu Val 50 55 60 Phe Val Tyr Lys Thr Asn Lys CysLys Lys Leu Ser Thr Asn Phe Phe 65 70 75 80 Gly Thr Lys Gly Val Gly ValSer Arg Arg Ser Phe Ala His Gly 
Thr 85 90 95 Ala Glu Trp Ser Gln Ser SerVal Glu Thr Lys Ile His Phe Ala Ser 100 105 110 Thr Phe Lys Pro Val 115123 10 PRT Homo sapien 123 Met Gly Arg Ser Leu Glu Val His Gly Val 1 510 124 42 PRT Homo sapien 124 Met Trp Arg Lys Gln Phe Pro Pro Gly GluThr Val Trp Pro Gly Phe 1 5 10 15 Pro Pro Gly Phe Phe Phe Phe Leu LeuCys Phe Phe Gly Asn Ser Phe 20 25 30 Met Thr Phe Asn Leu Thr Met Asn TyrGln 35 40 125 315 PRT Homo sapien 125 Phe Tyr Tyr Lys Thr Lys Ile ThrLys Thr Gly Trp Tyr Trp His Lys 1 5 10 15 Asp Lys His Leu Asp Gln AlaAsn Arg Ile Glu Thr Ala Glu Val Asn 20 25 30 Ser Tyr Ile Tyr Leu Gln LeuAsn Phe Tyr Lys Gly Val Arg Thr Ile 35 40 45 Pro Ser Glu Asn Asn Ile PheAsn Lys Ser Leu Trp Val Asn Cys Ile 50 55 60 Asp Thr Cys Lys Thr Met LysLeu Asp Ser Ala His Ile Leu Tyr Ala 65 70 75 80 Lys Ile Asn Phe Asn AlaLeu Gln Thr Ala Ile Gln Glu Leu Lys Leu 85 90 95 Lys Ile Ile Glu Glu LysVal Arg Val Thr Leu His Asp Leu Ala Phe 100 105 110 Asn Asn Glu Leu SerIle Met Ile Pro Lys Thr Gln Ala Ile Lys Asn 115 120 125 Lys Lys Asp LysArg Gln Pro Thr Lys Trp Glu Lys Ile Cys Ala Asn 130 135 140 Tyr Ile SerAsn Lys Asp Leu Leu Ser Arg Leu Ala Leu Leu Gln Pro 145 150 155 160 TyrThr Lys Thr Ala Leu Ile Ala Lys Leu Pro Lys Asp Leu Asn Arg 165 170 175His Phe Phe Lys Glu Asp Ile Leu Val Ala Gln Lys His Met Lys Arg 180 185190 Cys Ser Ile Ser Leu Ile Ile Arg Glu Met Gln Ile Lys Ser Pro Met 195200 205 Arg Tyr His Phe Thr Pro Thr Arg Met Ala Ile Ile Lys Lys Lys Thr210 215 220 Glu Asn Asn Lys Gly Phe Ser Gly Cys Gly Glu Ile Cys Asn PheIle 225 230 235 240 His Cys Trp Ala Glu Tyr Thr Met Ala Gln Pro Pro TrpArg Thr Val 245 250 255 Trp Glu Val Leu Gln Lys Val Glu Gln Asn Tyr AsnMet Thr Gln Gln 260 265 270 Ile Pro Leu Leu Asp Ile Tyr Pro Gln Lys AsnLys Thr Cys Cys Pro 275 280 285 Leu Lys Pro Cys Thr Gln Met Phe Thr AlaIle Leu Phe Ile Ile Ala 290 295 300 Lys Lys Lys Val Glu Thr Thr Asn GlnTrp Ile 305 310 315 126 66 PRT Homo sapien 126 Met Phe Leu Pro Tyr GlyLys Ser Glu Ala Ala Arg Glu Ala Ser Gly 1 5 10 
15 Ala Cys Lys Thr ThrAsp Gly Ile Val Ser Glu Leu Thr Met Asn Thr 20 25 30 Cys Ser Pro Leu SerIle Asp Gln Ser Lys Ser Asn Val Val Gly Lys 35 40 45 Gly Pro Ser Pro ThrVal Gly Gly Glu Gly Cys Gly His Leu Pro Leu 50 55 60 Ala Asp 65 127 40PRT Homo sapien 127 Met Glu Thr Lys Tyr Val His His Gln His Ile Phe TyrTyr Arg Leu 1 5 10 15 Pro Asn Ile Arg Phe Thr Asn Phe Ser Asn Phe ProThr Arg Asp Leu 20 25 30 Ser Phe Asn Val Pro Arg Asn Tyr 35 40 128 80PRT Homo sapien 128 Met Gly Val Gly Ala Gly Arg Thr Phe Phe Thr Arg GlyPro Ser Ser 1 5 10 15 Gly Pro Val Val Arg Arg Asn Ala Leu Pro Phe PhePhe Leu Lys Lys 20 25 30 Gly Val Ser Cys Leu Phe Cys His Arg Leu Gly GlyHis Asn Trp Glu 35 40 45 Gln Ile Val Gly Gly Ser Val Ile Arg Phe His ProPro Thr Gly Val 50 55 60 Tyr Ser Ala Ile Leu Pro Val Ala Arg Leu Pro CysLeu Pro Trp Arg 65 70 75 80 129 88 PRT Homo sapien 129 Met Tyr Leu SerPhe Met Ser Pro Arg Arg His Thr Gln Lys Val Lys 1 5 10 15 Ser Pro GlyArg Gly Leu Arg Ser Leu Pro Ser Gly Leu Pro Pro Ala 20 25 30 Arg Glu AlaPro Gln Cys Gly Arg Pro Leu Pro Arg Pro Thr Pro Arg 35 40 45 Leu Cys ProVal Pro Thr Leu Ala Val Trp Ala Thr Pro Ser Glu Leu 50 55 60 Leu Glu AlaThr Asn Thr Gln Val Ser Tyr Ser Thr Ser Thr Asp Pro 65 70 75 80 Gly LeuMet Gly Leu Tyr Ile Lys 85 130 49 PRT Homo sapien 130 Met Asn Gln AsnArg Gly Ser Gln Ser Arg Glu Lys Lys Ile Leu Gly 1 5 10 15 Ser Glu SerThr Leu Cys Pro Phe Glu Leu Gln Lys Glu Lys Glu Thr 20 25 30 Lys Ala LysSer Asn Gly Gly Gln Ala Ala Arg Tyr Leu Pro Gly Arg 35 40 45 Arg 131 87PRT Homo sapien 131 Met Val Val Phe Val Ser Cys Met Tyr Arg Phe Cys SerLeu Arg Leu 1 5 10 15 Leu Thr Val Gly Arg Arg His Lys Met Gly Ala AspCys Phe Ser His 20 25 30 Asn Ile Cys Gly Gly Asn Cys Gly Ala Gly Met ThrPro His Phe Gln 35 40 45 His Gln Gly Thr Ser Val Met Ala His Glu Phe SerVal Pro Ser Phe 50 55 60 Ser Cys Glu Ser Gln Asp Ile Ser Cys Ala Phe SerHis Lys Asp Thr 65 70 75 80 Arg Glu Gly Pro Gly Val His 85 132 26 PRTHomo sapien 132 Met Leu Ser Ser Gly Ala Val Val Met Ile Glu Arg 
Arg ProGly Gln 1 5 10 15 Val Leu Ala Leu Lys Thr Ile Thr Lys Gln 20 25 133 519PRT Homo sapien 133 Met Thr Cys Pro Asp Lys Pro Gly Gln Leu Ile Asn TrpPhe Ile Cys 1 5 10 15 Ser Leu Cys Val Pro Arg Val Arg Lys Leu Trp SerSer Arg Arg Pro 20 25 30 Arg Thr Arg Arg Asn Leu Leu Leu Gly Thr Ala CysAla Ile Tyr Leu 35 40 45 Gly Phe Leu Val Ser Gln Val Gly Arg Ala Ser LeuGln His Gly Gln 50 55 60 Ala Ala Glu Lys Gly Pro His Arg Ser Arg Asp ThrAla Glu Pro Ser 65 70 75 80 Phe Pro Glu Ile Pro Leu Asp Gly Thr Leu AlaPro Pro Glu Ser Gln 85 90 95 Gly Asn Gly Ser Thr Leu Gln Pro Asn Val ValTyr Ile Thr Leu Arg 100 105 110 Ser Lys Arg Ser Lys Pro Ala Asn Ile ArgGly Thr Val Lys Pro Lys 115 120 125 Arg Arg Lys Lys His Ala Val Ala SerAla Ala Pro Gly Gln Glu Ala 130 135 140 Leu Val Gly Pro Ser Leu Gln ProGln Glu Ala Ala Arg Glu Ala Asp 145 150 155 160 Ala Val Ala Pro Gly TyrAla Gln Gly Ala Asn Leu Val Lys Ile Gly 165 170 175 Glu Arg Pro Trp ArgLeu Val Arg Gly Pro Gly Val Arg Ala Gly Gly 180 185 190 Pro Asp Phe LeuGln Pro Ser Ser Arg Glu Ser Asn Ile Arg Ile Tyr 195 200 205 Ser Glu SerAla Pro Ser Trp Leu Ser Lys Asp Asp Ile Arg Arg Met 210 215 220 Arg LeuLeu Ala Asp Ser Ala Val Ala Gly Leu Arg Pro Val Ser Ser 225 230 235 240Arg Ser Gly Ala Arg Leu Leu Val Leu Glu Gly Gly Ala Pro Gly Ala 245 250255 Val Leu Arg Cys Gly Pro Ser Pro Cys Gly Leu Leu Lys Gln Pro Leu 260265 270 Asp Met Ser Glu Val Phe Ala Phe His Leu Asp Arg Ile Leu Gly Leu275 280 285 Asn Arg Thr Leu Pro Ser Val Ser Arg Lys Ala Glu Phe Ile GlnAsp 290 295 300 Gly Arg Pro Cys Pro Ile Ile Leu Trp Asp Ala Ser Leu SerSer Ala 305 310 315 320 Ser Asn Asp Thr His Ser Ser Val Lys Leu Thr TrpGly Thr Tyr Gln 325 330 335 Gln Leu Leu Lys Gln Lys Cys Trp Gln Asn GlyArg Val Pro Lys Pro 340 345 350 Glu Ser Gly Cys Thr Glu Ile His His HisGlu Trp Ser Lys Met Ala 355 360 365 Leu Phe Asp Phe Leu Leu Gln Ile TyrAsn Arg Leu Asp Thr Asn Cys 370 375 380 Cys Gly Phe Arg Pro Arg Lys GluAsp Ala Cys Val Gln Asn Gly Leu 385 390 395 400 Arg Pro Lys Cys Asp 
AspGln Gly Ser Ala Ala Leu Ala His Ile Ile 405 410 415 Gln Arg Lys His AspPro Arg His Leu Val Phe Ile Asp Asn Lys Gly 420 425 430 Phe Phe Asp ArgSer Glu Asp Asn Leu Asn Phe Lys Leu Leu Glu Gly 435 440 445 Ile Lys GluPhe Pro Ala Ser Ala Val Ser Val Leu Lys Ser Gln His 450 455 460 Leu ArgGln Lys Leu Leu Gln Ser Leu Phe Leu Asp Lys Val Tyr Trp 465 470 475 480Glu Ser Gln Gly Gly Arg Gln Gly Ile Asp Lys Leu Ile Asp Val Ile 485 490495 Glu His Arg Ala Lys Ile Leu Ile Thr Tyr Ile Asn Ala His Gly Val 500505 510 Lys Val Leu Pro Met Asn Glu 515 134 66 PRT Homo sapien 134 MetGly Arg Asp Lys Ser Glu Val Thr Val Asn Asn Lys Val Met Phe 1 5 10 15Tyr Gly Tyr Phe Ile Gly Asp Lys Phe Ile Thr Arg Ala Ile Ser Tyr 20 25 30His Val Leu Ile Leu Pro Gly Cys Asn Met Val Thr Leu Glu Thr Gln 35 40 45Ile Leu Asn Ile Gly Val Lys Thr Thr Ser Cys His Ser Ile Leu Ser 50 55 60Thr Val 65 135 91 PRT Homo sapien 135 Met Val Cys Val Val Val Ala CysGly Trp Ala Asp Val Cys Val Pro 1 5 10 15 Ser Trp Cys Val Leu Cys CysSer Val Val Ser Trp Leu Val Val Val 20 25 30 Cys Trp Cys Leu Tyr Ala SerVal Leu Cys Glu Ser Ala Val Thr Val 35 40 45 Val Ala Leu Leu Cys Ser LeuAla Ser Ala Ser Val Gly Val Trp Trp 50 55 60 Ser Val Phe Trp Trp Cys SerPhe Leu Leu Cys Val Leu Cys Val Val 65 70 75 80 Phe Asp Arg Leu Arg TrpPro Ala Ile Cys Thr 85 90 136 76 PRT Homo sapien 136 Met Leu Thr Cys SerGly Phe His Gly Thr Asp Tyr Pro Phe Ile Asn 1 5 10 15 Thr Glu Asn ArgLys Thr Thr Gln Lys Lys Lys Lys Thr Gln Thr Leu 20 25 30 Gly Arg Gln ProGly Val Pro Thr Val Leu Pro Arg Cys Gly Leu Thr 35 40 45 Leu Cys Thr ArgPro Thr Asn Leu Pro Pro Thr His Phe Ser Asn His 50 55 60 Asn Thr Ser ThrPro Leu Thr Lys Asp Ser Thr Ile 65 70 75 137 101 PRT Homo sapien 137 MetTrp Leu Ser Pro Ala Ser His Asn Ser Pro Pro Gln His Ser Gly 1 5 10 15Arg Asp Thr Lys Thr Ser Thr Gln Arg Gly Gly Val Thr Arg Thr Asn 20 25 30Ser Gly Ala Asp Glu Pro His Asn Lys His Ile Glu Thr Glu Ile Thr 35 40 45Lys Thr Asp Thr Asn Asn Arg Asp Thr Gln Arg Thr Lys Gln Ala Gln 50 55 60Lys 
Pro Asn Lys Glu Glu Ala Arg Lys Ala Gln Pro Thr Ser Thr Thr 65 70 7580 Thr Asn Lys Thr Asn Gly Thr Lys Glu His Ser Lys Gln Gln Thr Pro 85 9095 Thr His Asn His Thr 100 138 80 PRT Homo sapien 138 Met Val Cys AlaAla Trp Leu Pro Ser Ala Cys Pro Pro Trp Ser Val 1 5 10 15 Asp Ala ProSer Thr Pro Leu Leu Gly Pro Cys Gln Pro Leu Val Val 20 25 30 Glu Phe SerSer Pro Gly Val Val Val Gly Gly Pro Ser Met Ser Val 35 40 45 Trp Gly GlyArg Leu Arg Cys Pro His Trp Met Gln Pro Phe Ser Thr 50 55 60 Ile Ser GlyLeu Lys Arg Asp Arg Val Arg Asn Val Asp Pro Leu Val 65 70 75 80 139 36PRT Homo sapien 139 Met His Leu Glu Arg Arg Ser Val Met Asp Gly Glu ValAsn Leu Ile 1 5 10 15 Ser Leu Ser Gly Phe Leu Ser Tyr Cys Ile Phe IleTyr Lys Thr Asn 20 25 30 Phe Ile Leu Lys 35 140 45 PRT Homo sapien 140Met Trp Asn Phe Val Phe Leu Leu Ile Gly Gly Gly Gly Leu Ile Arg 1 5 1015 Gly Val Val Cys Ala Pro Arg Arg Met Val Gly Val Cys Glu Asn Asn 20 2530 Lys Lys Asn Val Leu Arg Arg Glu Arg Gly Val Val Cys 35 40 45 141 136PRT Homo sapien 141 Met Gly Trp Asn Thr Val Gly Arg Ser Gln Leu Ser AlaAla Leu Asn 1 5 10 15 Ser Trp Ala Gln Ala Met Phe Ser Pro Gln Leu ProSer Ser Trp Ala 20 25 30 Cys Arg His Val Ser Ala Cys Leu Ala Tyr Phe LeuPhe Phe Phe Phe 35 40 45 Ser Phe Phe Phe Phe Leu Phe Phe Phe Phe Tyr PhePhe Phe Leu Leu 50 55 60 Lys Arg Ala Gly Gly Gly His Ile Met Val Trp ArgArg Arg Arg Trp 65 70 75 80 Ser Leu Gln Thr Ser Gly Val Pro Glu Val ValPhe Ser Ala Glu Cys 85 90 95 Cys Val Thr Thr Arg Cys Arg Gly Ser Thr ArgTrp Gly Lys Glu Ser 100 105 110 Val Ala Trp Gly Arg Asn Val Val Val AlaArg Pro Asn Phe Ala Pro 115 120 125 Lys Ile Ala Arg Thr His Glu Asn 130135 142 51 PRT Homo sapien 142 Met Asp Gln Ile Phe Pro Lys Arg Tyr LeuMet His Asn Ala Lys Lys 1 5 10 15 Thr Lys Lys Lys Lys Lys Arg Gly GlyLys Pro Ala Gln Glu Arg Ala 20 25 30 Arg Gly Glu Thr Gly Val Pro Gly ProAsn Phe Pro Lys Lys Phe Ala 35 40 45 Thr Gln Lys 50 143 219 PRT Homosapien 143 Met Val Leu Ala Leu Ile Val Asp Leu Cys Leu Trp Leu Ser ProArg 1 5 10 15 Thr Gly 
Ala Gly Arg Leu Thr Ser Phe Leu Ser Leu Ser LeuCys Arg 20 25 30 Leu Ser Leu Cys Leu Phe Tyr Leu Phe Gly Val Ser Gly GlyTrp Cys 35 40 45 Gly Asp Ser Ser Ser Phe Ser Val Leu Pro Pro Arg Ile ArgPhe Arg 50 55 60 Gly Arg Arg Ala Ala Val Val Ala Ser His Leu Leu Ile SerAla Pro 65 70 75 80 Pro Leu Phe Cys Val Val Phe Leu His Cys Cys Ser AlaVal Cys Ser 85 90 95 Ser Trp Arg Arg Val Ser Gly Leu Cys Arg Pro Pro LeuLeu Arg Ser 100 105 110 Ser Arg Phe Cys Arg Arg Pro Leu Leu Leu Ser PheIle Thr Pro His 115 120 125 Leu Ser Ser Ser Arg Arg Gly Val Val Thr PheGly Leu Val Leu Pro 130 135 140 Phe Phe Trp Trp Leu Gly Arg Arg Ala HisAsp Phe Phe Val Ser Pro 145 150 155 160 Arg Trp Leu Gly Ala Pro Gly ProPro Lys Lys Lys Pro Pro Pro Pro 165 170 175 Pro Thr Pro Gln Lys Lys LysThr Pro Pro Pro Pro Pro Lys Lys Lys 180 185 190 Lys Lys Lys Lys Lys LysLys Lys Lys Lys Lys Lys Lys Lys Lys Lys 195 200 205 Lys Lys Lys Gly GlyGly Thr Ser Ala Ala Thr 210 215 144 37 PRT Homo sapien 144 Met Arg SerPhe Arg Glu Ile His Ser Glu Arg Thr Leu Met Val Asn 1 5 10 15 Leu ArgGly Lys Ser Gln Asp Ala Gln Lys Leu Trp Ser Leu Val Leu 20 25 30 Ile SerGln Ser Ile 35 145 280 PRT Homo sapien 145 Met Val Val Phe Gly Val IleCys Leu Cys Cys Val Cys Pro Ile Leu 1 5 10 15 Phe Phe Ser Val Phe LeuPhe Val Val Val Cys Ser Val Val Cys Leu 20 25 30 Leu Ser Leu Val Ser AlaGly Cys Leu Val Gly Glu Leu Pro Phe Cys 35 40 45 Phe Ser Phe Val Leu CysVal Leu Gly Arg Ala Leu Ser Leu Leu Pro 50 55 60 Ser Leu Val Val Trp LeuLeu Ser Ser Ser Leu Cys Val Ser Leu Trp 65 70 75 80 Ser Phe Leu Leu PheLeu Val Leu Val Val Leu Val Ser Arg Gly Phe 85 90 95 Phe Ser Phe Val SerGly Ile Cys Val Cys Val Leu Cys Leu Leu Ser 100 105 110 Phe Val Phe ValVal Cys Cys Arg Leu Arg Leu Phe Ile Ser Arg Leu 115 120 125 Cys Leu LeuArg Phe Leu Tyr Leu Ser Ser Val Cys Phe Ser Leu Phe 130 135 140 Phe SerPhe Ala Val Val Ser Arg Val Leu Phe Pro Thr Arg Gly Cys 145 150 155 160Val Leu Leu Trp Leu Arg Gly Asp Thr Gln Ile Leu Trp Gly Gly Lys 165 170175 Val Cys Gly Arg Arg Pro Arg 
Gly Asp Thr Pro His Met Met Phe Pro 180185 190 His Pro His Ala Gly Leu Ile Thr Ala Leu Phe Gly Ala Pro Thr Arg195 200 205 Gly Val Tyr Ser Pro Pro Thr Ala Arg Phe Phe Val Val Tyr IleIle 210 215 220 Gly Asp Thr Ser Phe Phe Arg Gly Gly Pro His His Tyr LeuGly Gly 225 230 235 240 Ser Thr His Leu Gly Glu Thr Pro Arg Ala Val SerSer Leu Ile Ile 245 250 255 Tyr Ile Lys Ile Tyr Gly Ala Arg Asp Arg ArgTyr Ile Thr Arg Gly 260 265 270 Leu Ser Phe Val Asp Ser Glu Lys 275 280146 95 PRT Homo sapien 146 Met Pro Val Val Pro Ala Ile Trp Glu Ala LysGlu Asp Arg Leu Ser 1 5 10 15 Ser Gly Asp Arg Gly Cys Ser Gly Leu ArgSer Ala Pro Gln Pro Ser 20 25 30 Ser Leu Val Lys Arg Glu Arg Phe His ArgLeu Ile Asn Gln Gln Thr 35 40 45 Lys Thr Arg Ile Tyr Asp Gln Ala Gln TrpLeu Thr Pro Ile Ile Pro 50 55 60 Val Leu Trp Glu Ala Arg Ala Gly Arg PhePhe Glu Val Arg Ser Ser 65 70 75 80 Arg Pro Ala Trp Ala Thr Gln Gly AspPro Val Ser Thr Lys Val 85 90 95 147 90 PRT Homo sapien 147 Arg Ile TyrAsp Gln Ala Gln Trp Leu Thr Pro Ile Ile Pro Val Leu 1 5 10 15 Trp GluAla Arg Ala Gly Arg Phe Phe Glu Val Arg Ser Ser Arg Pro 20 25 30 Ala TrpAla Thr Gln Gly Asp Pro Val Ser Thr Lys Ser Leu Lys Ile 35 40 45 Ser AlaVal Trp Trp His Thr Ser Val Val Ser Pro Thr Leu Glu Ala 50 55 60 Glu ValAsp Cys Ser Ser Pro Gly Val Gln Ala Ser Val Ser Tyr Asp 65 70 75 80 HisSer Thr Ala Leu Pro Ala Arg Gln Glu 85 90 148 79 PRT Homo sapien 148 MetSer Ser Leu Leu Pro Ala Phe Phe Val Ser Ile Asn Val Thr Ser 1 5 10 15Thr Tyr Pro Val Ile Gln Gly Lys Thr Gln Trp Arg Lys Pro Ser Ser 20 25 30Thr Thr His Ser Leu Tyr Leu Thr Leu Ser Gln His Pro Ala Lys Ser 35 40 45Arg Ser Lys Tyr Ser Ser Ser Leu Ser Thr Ser Leu Pro Phe Leu Gln 50 55 60Ser Ile Thr Leu Val Tyr Ser Ile Thr Ile Ser Gln Leu Asp Tyr 65 70 75 14932 PRT Homo sapien 149 Met Gly Ser Thr Thr Asp Val Ser Gly Ser Gln CysGly Cys Gln Phe 1 5 10 15 Leu Tyr Leu Ala Ala Thr Thr Leu Ser Ile ThrLeu Arg Arg Ser Arg 20 25 30 150 57 PRT Homo sapien 150 Met Gly Leu ThrLeu Leu Leu Tyr Ser Ile Gly Glu Lys Asn 
Tyr Ile 1 5 10 15 Pro Thr GluLys Thr Glu Gly Glu Ala Ile Thr Thr Thr Lys Gln Ser 20 25 30 Val Thr ProArg Arg Glu Glu Met Gly Phe Pro Arg His Thr Pro His 35 40 45 Asn His LeuGln Gln Pro Gln Pro Ser 50 55 151 28 PRT Homo sapien 151 Met Phe Arg GlyGln Ala Asp Ile Ile Thr Trp Cys Thr Phe Ser Ser 1 5 10 15 His Cys LeuAla Lys Gly Ser Arg Ser Thr Ser Ser 20 25 152 13 PRT Homo sapien 152 MetSer Ser Gly Ala Gly Glu Asp Ser Gly Ala Gly Arg 1 5 10 153 87 PRT Homosapien 153 Met Gly Ala Leu Phe Pro Leu Pro Arg Tyr Ile Leu Thr Arg LeuArg 1 5 10 15 Ser Val Val Leu Ala Cys Gly Arg Val Glu Asn Gln Gly SerLeu Lys 20 25 30 Met Cys Gly Leu Tyr Thr Val Tyr Pro Gln Asn Ser Gly AspAsn Ala 35 40 45 Gly Glu Asn Asn His Val Glu Thr Lys Lys Cys His Ala AsnLys Gly 50 55 60 Gln Glu Pro Gly Lys Lys Gly Ser Arg Phe Val Cys Asp ValIle Phe 65 70 75 80 His Met Ala Ser Ser Pro His 85 154 57 PRT Homosapien 154 Met Ser Tyr Val Pro Cys Phe Tyr Ser Asn Val Asn Ser Ser AsnPhe 1 5 10 15 Phe Ala Phe Phe Leu Leu Val Asn Val Cys Val Ile Ser PheVal Phe 20 25 30 Ile Asp Val Thr Trp Phe Tyr Phe Phe Ile Leu Leu Gln PheThr Ser 35 40 45 Ile Ser Gly Thr Leu Phe Ala Ala Lys 50 55 155 115 PRTHomo sapien 155 Met Phe Val Gly Gly Glu Leu Leu Arg Pro Glu Glu Pro GlnPhe His 1 5 10 15 Pro Thr Gly Thr His Thr Tyr Ser Thr Gln Glu Val ProPro Lys Arg 20 25 30 Phe Phe Phe Phe Phe Phe Phe Phe Cys Asn Leu Pro LysSer Asn His 35 40 45 Pro Thr Phe Leu Glu Ile Leu Lys Thr Pro Lys Arg LysIle Ile Ser 50 55 60 Asn Asn Ser Thr Pro Thr Ser Lys Ala Phe Val Met ArgHis Ser Gln 65 70 75 80 Ser Ile Phe Phe Phe Phe Phe Phe Leu Val Arg ValSer Val Thr Gln 85 90 95 Ala Gly Ile Gln Trp Cys Asp Leu Ser Ser Pro GlnPro Pro Pro Pro 100 105 110 Arg Phe Lys 115 156 67 PRT Homo sapien 156Met Cys Val Tyr Ile Ser Pro Gly Ser Thr His Lys Phe Ser His Thr 1 5 1015 Pro His Thr His Ile Ile Leu Gly Arg Ala Thr Gln Asn Ala Lys Lys 20 2530 Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Met Lys Lys Lys 35 4045 Lys Lys Lys Lys Lys Lys Glu Lys Ile Lys Glu Asn Gln Arg 
Gln Thr 50 5560 Glu Lys Thr 65 157 51 PRT Homo sapien 157 Met His Ile Tyr Leu Val ArgIle Pro Phe Gly Leu Leu Asn Arg Leu 1 5 10 15 Thr Leu Glu Phe Ala GlnAsp Thr Glu Ala Asn Leu Ser Ala Gly Lys 20 25 30 Asn Pro Asp Ala Pro HisIle Leu Arg Glu Pro His Met Ser Cys Ser 35 40 45 Tyr Cys Cys 50 158 135PRT Homo sapien 158 Met Phe Phe Val Arg Ala Cys Ile Leu Phe Tyr Thr GlnTyr Leu Ser 1 5 10 15 Phe Glu Trp His Leu Gln Tyr Ala Ala Pro Thr ProSer Phe Cys Ser 20 25 30 Leu Arg His Leu Leu Cys Ser Cys Leu Pro His PheTyr Cys Leu Val 35 40 45 Val Cys Leu Leu Pro Ala Ser Leu Ser Val Leu ProPro Phe Leu Phe 50 55 60 Leu Pro Leu Leu Ala Leu Asp Thr Leu Phe Ala ValThr Arg Lys Cys 65 70 75 80 Leu Cys Gly Gly Lys Phe Val Glu Ser Arg GluArg Tyr Thr His Ile 85 90 95 Val Thr His Thr Arg Gly Thr His Ser Tyr TrpArg Pro Gln Arg Val 100 105 110 Phe Thr Pro Gln Arg Leu Phe Ser Leu PheIle Ile Ser Pro Arg Glu 115 120 125 Lys Asn Tyr Lys Glu Val Ile 130 135159 102 PRT Homo sapien 159 Met Arg Val Val Pro Glu Met Val His Val ValGln Val Ile Cys Leu 1 5 10 15 Leu Met Phe Val Ser Leu Phe Ile His GlyVal Asp Trp Arg Glu Gly 20 25 30 Thr Lys Ser Ile Cys Leu Tyr Ile Arg ThrSer Val Val Arg Cys Ile 35 40 45 Phe His Val Thr Ser Leu Leu Glu Asp GlnThr Pro Tyr Val Leu Gln 50 55 60 Tyr Ala Leu Pro Met Ala Val Leu Arg ArgLys Leu Arg Leu Phe Cys 65 70 75 80 Phe Asn Arg Gly Trp Cys Thr Trp LeuSer Lys Tyr Ser Val Lys Ser 85 90 95 Ser Ile Ser Glu Gly Asn 100 160 21PRT Homo sapien 160 Met Ser Val Leu Ser Val Ala Glu Leu Ser Val Ser TrpHis Ser Cys 1 5 10 15 Ala Cys Val Lys Leu 20 161 16 PRT Homo sapien 161Met Thr Thr Ser Val Val Asn Phe Arg Asn Tyr Phe Phe Thr Ser Val 1 5 1015 162 85 PRT Homo sapien 162 Met Arg Gly Phe Leu Phe Pro Asp Gly IleGln Gly Ala Thr Ser Phe 1 5 10 15 Phe Leu Pro Gly Lys Lys Arg Tyr ThrCys Cys Leu Asp Ser Ser Pro 20 25 30 His Phe Pro Pro Val Leu His His GlyPro Leu Asn Phe Leu Phe Val 35 40 45 Leu Leu Pro Pro Ser Asn Asn His GluAsn Asn Leu Gly Glu Val Phe 50 55 60 Gln Ile Met Lys Lys Lys Gln Lys 
LysGln Lys Asn Asn Gln Arg Gly 65 70 75 80 Asp Leu Gly Arg Asp 85 163 40PRT Homo sapien 163 Met Tyr Leu Thr Leu Ser Phe Ser Val Met Tyr Asn CysHis Phe Leu 1 5 10 15 Ile Leu Tyr Ile Met Tyr Leu Phe Asp Ile Arg PheAsn Asn Tyr Ile 20 25 30 Asn Phe Ile His Ser Leu Phe Glu 35 40 164 33PRT Homo sapien 164 Met Ser Pro Gln Gln Thr Ile Leu Arg Val Ile Pro GluGln Lys Ser 1 5 10 15 Thr Thr Thr Gln Leu Thr Leu Ile Leu Ser Leu ThrLys Ser Ile Thr 20 25 30 Leu 165 46 PRT Homo sapien 165 Met Glu Leu ProPhe Asn Lys Glu Ile Leu Pro Lys Gln Lys Lys Lys 1 5 10 15 Lys Lys LysLys Lys Gly Trp Gly Ser Trp Pro Ala Val Pro Val Leu 20 25 30 Asn Trp PheSer Gly Pro Lys Phe Pro Lys Ile Arg Glu Gln 35 40 45 166 24 PRT Homosapien 166 Met Ala Ile Val Pro Leu Asp His Ala Ser Ser Gly Ala Ser CysAsp 1 5 10 15 Gly Leu Val Ala Ala Arg Tyr Asn 20 167 75 PRT Homo sapien167 Met Thr Thr Tyr Ala Ile Gly Cys Glu Asp Glu Ala Ile Ala Ala Lys 1 510 15 Pro Gly Val Ser Asn Asp Asn Glu Arg Arg Pro Cys Thr Ile Val Leu 2025 30 Glu Leu Arg Arg Glu Pro Leu Ser Leu Ser Ser Pro Ile Ser Lys Ala 3540 45 Leu Pro Val Asn Gln Glu Thr Ala Cys Thr Thr Cys Val Glu Gln Ser 5055 60 Leu Ser Leu Leu His Asp Ala Pro Met Leu Val 65 70 75 168 91 PRTHomo sapien 168 Met Leu Cys His His Val Ile Arg Tyr Asn Leu His Phe SerVal Leu 1 5 10 15 Thr Ser His Pro Ile Tyr Thr Val Leu Tyr Ala His LysCys Ile Gly 20 25 30 Gly Arg His Gln Phe Val Met Ala His Val Ser His AsnMet Lys Tyr 35 40 45 Leu Glu Glu Leu Leu Tyr Val Gly Glu Cys Pro Tyr ValGly Val Asn 50 55 60 Val Ser Met Tyr Phe Leu Arg Val Ala Arg Pro Thr CysLeu Leu Cys 65 70 75 80 Phe Thr Tyr Asp Phe Tyr Thr Arg Ala Arg Ala 8590 169 211 PRT Homo sapien 169 Met Ala Ala Glu Ala Thr Thr Glu Arg ArgArg Arg Glu Ser Glu Glu 1 5 10 15 Thr Arg Arg Arg Glu Arg Ala Arg ArgArg Asn Glu Arg Arg Lys Arg 20 25 30 Gly Ala Glu Ala Glu Arg Gly Asp ArgThr Ala Arg Glu Glu Ser Glu 35 40 45 Ala Pro Asn Gly Glu Arg Asn Asn GluArg Glu Thr Asp Glu Thr Arg 50 55 60 Thr Gln Arg Arg Arg Arg Thr Thr HisArg Gln Arg Arg Glu 
Lys Thr 65 70 75 80 Ser Arg Glu Ala His His Gly GlnSer Ala Glu Ala Gln Pro Gln Glu 85 90 95 Thr Thr Thr Gly Pro Arg Glu GlnArg Arg Gln Met Arg Ala Glu Ala 100 105 110 Thr Arg Thr Thr Val Lys AspGln Asp Glu Thr Ser Ser Lys Glu Lys 115 120 125 Arg Arg Met Arg Thr HisAsn Ile Lys Ile Arg Gln Thr Arg Ser Gly 130 135 140 Thr His Asp Ala ArgGln Arg Glu Glu Arg His Thr Thr Asn Lys His 145 150 155 160 Ala Arg SerArg Gly Gln His Glu Arg Lys Gln Pro Glu Gln Lys Gln 165 170 175 Glu SerAla Gly Lys Arg Arg Gly Asp Ser Ser Asn Arg Arg Ala Thr 180 185 190 GlnArg Arg Lys Arg Leu Glu Lys Glu Lys Thr Gln Lys Thr Arg His 195 200 205Gly Arg His 210 170 82 PRT Homo sapien 170 Met Phe Ile Ser Val Phe HisVal Trp Phe Val Ala Val Val Val Gly 1 5 10 15 Glu Ile Gly Ser Arg GlyLys His Asn Phe Tyr Thr Pro Arg Asn Gln 20 25 30 Arg Leu Ala Pro Arg SerPhe Pro Arg Pro Ala Ser Leu Val Tyr Thr 35 40 45 Arg Asn Ile Ser Cys SerPhe Ser Pro Gln Arg Thr His Gly Arg Asp 50 55 60 Thr Gly Ser Leu Gly ProHis Val Met Lys Arg Tyr Trp Ala Pro Pro 65 70 75 80 Thr Ala 171 153 PRTHomo sapien 171 Met Ser Leu Ala Asp Gly His Ser Trp Arg Pro Gln Phe MetPhe Asn 1 5 10 15 Arg Asn Ser Leu Arg Asn Ile Leu Arg Leu Pro His ProLeu Val Val 20 25 30 Leu Pro Ser Phe Leu Pro Ser Leu Arg Val Lys Gly ProArg Gly Pro 35 40 45 Phe Trp Val Leu Leu Trp Lys Ala Arg Asp Val Ser ValPhe His Arg 50 55 60 Thr Ala Trp Arg Pro Lys His Pro Gly Ala Pro Ile GlyArg Gly Ser 65 70 75 80 Pro Gly Gly Val Thr Val Trp Phe Tyr Arg Arg SerPro Lys Leu Pro 85 90 95 Pro Pro His His Cys Gln Gln Gln Lys Val Gly ProLeu Gly Ala Gly 100 105 110 Ala Thr Met Leu Asn Thr Gly Ser Ser Arg GluHis Ala Ala Gln Ala 115 120 125 Thr Lys Ala Gly Arg Ser Lys Thr Gln AlaHis Thr Lys Asn Glu Ile 130 135 140 Ser Lys Gln Ala Thr Glu Gln Ala Ser145 150 172 32 PRT Homo sapien 172 Met Gln Pro Arg Gly Ser Thr Asp AsnArg Ile Leu Lys Lys Val Ala 1 5 10 15 Ala Pro Pro Val Ile Ile Asn AsnLeu Ile Lys Phe Thr Glu Leu Tyr 20 25 30 173 48 PRT Homo sapien 173 MetSer Val Gly Trp Asp Cys Ser Gln 
Val Tyr Ile Thr Lys Arg Ile 1 5 10 15Gly Ala Thr His Val Gly Phe Met Phe Cys Asp Val Leu Ser Ile Cys 20 25 30Val Asn Ala Phe His Met Val Ser Gly Leu Glu Cys Tyr Gly Pro Leu 35 40 45174 17 PRT Homo sapien 174 Met Lys Thr Gln Glu Lys Arg Met Val Asn LysGlu Asp Pro Asn Tyr 1 5 10 15 Leu 175 132 PRT Homo sapien 175 Val ValMet Thr Leu Asn Glu His Ala Ala Phe Lys His Leu Phe Asn 1 5 10 15 LysAla His Leu Ala Pro Pro Leu Ile His Leu Thr Leu Ser Gly His 20 25 30 SerThr Cys Phe Arg Glu His Arg Val Gly Asp Lys Val Thr Asp Gln 35 40 45 GlnAsp Pro Lys Ala Glu Glu Phe Phe Leu Val Gln Asn Lys Met Lys 50 55 60 SerLeu Pro Cys Leu Leu Leu Ser Thr Glu Thr Arg Gln Pro Ser Asp 65 70 75 80Phe Ser Ile Phe Ser Pro Leu Phe Pro Leu Phe Tyr Ser Thr Lys Pro 85 90 95Pro Leu Ser Ser Trp Pro Val Leu Asn Glu Leu Leu Gly Thr Pro Pro 100 105110 Arg Arg Gly Gly Gly Arg Ala Glu Gly Leu Leu Thr Ser Gln Gly Leu 115120 125 Leu Thr Ser Gln 130 176 114 PRT Homo sapien 176 Met Ile Glu LeuLeu Ser Ser Ser Val Tyr His Glu Gly Pro Pro His 1 5 10 15 Ala Val PheGly Ala Pro Val Leu Pro Pro Ser Val Ser Cys Ile Val 20 25 30 Cys Thr ThrPro Pro Gln Leu Gly Gly Pro Pro Pro Pro Pro Pro Leu 35 40 45 Val His AlaThr Phe Pro Pro Pro Phe Pro Arg Thr Thr Pro Pro Phe 50 55 60 Phe Thr ProPro Pro Pro Pro Phe Leu Leu Phe Pro Pro Pro Pro Pro 65 70 75 80 Pro ProArg Val Phe Phe Phe Lys Lys Lys Lys Lys Lys Lys Lys Lys 85 90 95 Gln LysLys Lys Lys Lys Lys Lys Lys Gly Gly Gly Thr Cys Pro Ala 100 105 110 AlaAla 177 43 PRT Homo sapien 177 Met Pro Tyr Leu Arg Leu Trp Lys Asn GlyVal Tyr Ser Pro Cys Asn 1 5 10 15 Phe Leu Gly Glu Lys Lys Pro Phe ProMet Asp Leu Lys Lys Lys Lys 20 25 30 Lys Lys Lys Lys Lys Asn Leu Ala AlaThr Thr 35 40 178 213 PRT Homo sapien 178 Met Thr Ser Asp Glu Ala ThrThr Glu Thr Arg Pro Ala Arg Glu Ala 1 5 10 15 Glu Lys Gly Ala Glu LysGln Lys Ala Thr Glu Lys Gly Lys Thr Lys 20 25 30 Lys Thr Ser Thr Ser TyrArg Arg Ser Gln Arg Met Arg Lys Glu Arg 35 40 45 Arg Arg Arg Lys His GluAla Thr Arg Arg Arg Thr Gly Glu Glu Arg 50 55 
60 Glu Asn Arg Gly Arg ArgArg Glu Gln Arg Arg Arg Arg Thr Lys Val 65 70 75 80 Gly Ser Gln Glu GluThr Lys Arg Glu Val Gln Thr Glu Gln Gly Arg 85 90 95 Lys Arg Pro Lys GlyGln Lys Lys Glu Thr Gln Arg Arg Lys Lys Arg 100 105 110 Arg Lys Lys LysSer Gln Arg Arg Arg Thr Gly Lys Arg Lys Gln Glu 115 120 125 Glu Lys ThrThr Gln Arg Glu Arg Arg Glu Lys Asp Lys Arg Ser Arg 130 135 140 Arg GluTrp Lys Tyr Ala Glu Glu Glu Glu Thr Asp Asn Glu Glu Arg 145 150 155 160Arg Arg Lys Lys Arg Lys Arg Gln Gln Lys Lys Arg Glu Lys Lys Arg 165 170175 Arg Ser Lys Lys Ser Arg Ser Lys Asn Glu Ala Asp Lys Glu Arg Ala 180185 190 Glu Thr Thr Arg Arg Glu Glu Arg Glu Arg Glu Thr Glu Glu Glu Lys195 200 205 Thr Arg Asn Arg Ser 210 179 434 PRT Homo sapien 179 Met SerAla Asp Ala Ala Ala Gly Ala Pro Leu Pro Arg Leu Cys Cys 1 5 10 15 LeuGlu Lys Gly Pro Asn Gly Tyr Gly Phe His Leu His Gly Glu Lys 20 25 30 GlyLys Leu Gly Gln Tyr Ile Arg Leu Val Glu Pro Gly Ser Pro Ala 35 40 45 GluLys Ala Gly Leu Leu Ala Gly Asp Arg Leu Val Glu Val Asn Gly 50 55 60 GluAsn Val Glu Lys Glu Thr His Gln Gln Val Val Ser Arg Ile Arg 65 70 75 80Ala Ala Leu Asn Ala Val Arg Leu Leu Val Val Asp Pro Glu Thr Asp 85 90 95Glu Gln Leu Gln Lys Leu Gly Val Gln Val Arg Glu Glu Leu Leu Arg 100 105110 Ala Gln Glu Ala Pro Gly Gln Ala Glu Pro Pro Ala Ala Ala Glu Val 115120 125 Gln Gly Ala Gly Asn Glu Asn Glu Pro Arg Glu Ala Asp Lys Ser His130 135 140 Pro Glu Gln Leu Ser Leu Val Ala Val Ser Asp Gly Ser Val ArgGly 145 150 155 160 Ala Thr Arg Ser Leu Leu Asp Arg Glu Arg Ala Gln PheGly Ile Lys 165 170 175 Arg Gln Asn Pro Ala Leu Pro Gln Leu Gly Gly GluGly Pro Arg Ala 180 185 190 Met Val Ala Glu Leu Gly Gln Arg Glu Leu ArgPro Arg Leu Cys Thr 195 200 205 Met Lys Lys Gly Pro Ser Gly Tyr Gly PheAsn Leu His Ser Asp Lys 210 215 220 Ser Lys Pro Gly Gln Phe Ile Arg SerVal Asp Pro Asp Ser Pro Ala 225 230 235 240 Glu Ala Ser Gly Leu Arg AlaGln Asp Arg Ile Val Glu Val Asn Gly 245 250 255 Val Cys Met Glu Gly LysGln His Gly Asp Val Val Ser Ala Ile Arg 260 265 270 
Ala Gly Gly Asp GluThr Lys Leu Leu Val Val Asp Arg Glu Thr Asp 275 280 285 Glu Phe Phe LysLys Cys Arg Val Ile Pro Ser Gln Glu His Leu Asn 290 295 300 Gly Pro LeuPro Val Pro Phe Thr Asn Gly Glu Ile His Lys Asp Pro 305 310 315 320 LeuThr Pro Ser Ser Asp Asn Pro Gln Pro Ser Pro Leu Cys Gln Glu 325 330 335Asn Ser Arg Glu Ala Leu Ala Glu Ala Ala Leu Glu Ser Pro Arg Pro 340 345350 Ala Leu Val Arg Ser Ala Ser Ser Asp Thr Ser Glu Glu Leu Asn Ser 355360 365 Gln Asp Ser Pro Pro Lys Gln Asp Ser Thr Ala Pro Ser Ser Thr Ser370 375 380 Ser Ser Asp Pro Ile Leu Asp Phe Asn Ile Ser Leu Ala Met AlaLys 385 390 395 400 Glu Arg Ala His Gln Lys Arg Ser Ser Lys Arg Ala ProGln Met Asp 405 410 415 Trp Ser Lys Lys Asn Glu Leu Phe Ser Asn Leu AsnGlu Leu Phe Ser 420 425 430 Asn Leu 180 49 PRT Homo sapien 180 Met GlySer Cys Ser Val Ala Gln Val Gly Val Met Trp His Asp Leu 1 5 10 15 GlySer Leu Gln Pro Leu Pro Pro Gly Phe Lys Gln Phe Ser Cys Leu 20 25 30 SerLeu Leu Ser Ser Trp Asp Tyr Arg Arg Thr Cys Pro Gly Gly Arg 35 40 45 Ser181 59 PRT Homo sapien 181 Phe Phe Phe Leu Phe Val Cys Leu Phe Glu MetGly Ser Cys Ser Val 1 5 10 15 Ala Gln Val Gly Val Met Trp His Asp LeuGly Ser Leu Gln Pro Leu 20 25 30 Pro Pro Gly Phe Lys Gln Phe Ser Cys LeuSer Leu Leu Ser Ser Trp 35 40 45 Asp Tyr Arg Cys Glu Pro Gln Arg Leu AlaArg 50 55 182 193 PRT Homo sapien 182 Met Ser Tyr Ser Phe Ala Ser SerVal Val Leu Val Asp Ser Leu Thr 1 5 10 15 Ser Phe Leu Gly Pro Phe ThrPhe Ser Leu Leu Ala Thr Ser Arg Ile 20 25 30 Leu His Leu Tyr Leu Ala ProArg Val Arg Leu Ser Cys Ser Ser Leu 35 40 45 Ser Pro Phe Ala Cys Leu LeuCys Ser Leu Leu Trp Val Arg Val Ser 50 55 60 Ser Ser Ser Thr Arg Ser IleCys Ser Leu Ser Val Phe Cys Val Cys 65 70 75 80 Ser Gly Leu Ser Leu ValCys Val Arg Tyr Phe Phe Ala Leu Cys Ser 85 90 95 Ser Leu Phe Arg Pro CysSer Phe Leu Ser Leu Leu Arg Ser Leu Leu 100 105 110 Leu Ser Ile Leu PhePhe Ser Cys Phe Leu Ala Leu Ser Leu Ser Ser 115 120 125 Leu Ser Ile TyrLeu Pro Leu Leu Ser His Ser Leu Ser Phe Arg Asp 130 135 140 Pro 
Arg SerIle Val Tyr Leu Ile Phe Asp Phe Leu Ser Leu Tyr His 145 150 155 160 SerLeu Cys Pro Ser Tyr Ser Ser Tyr Ser Ile Asn Asp Ser Arg Gly 165 170 175Leu Ile Pro Thr Arg Ala Leu Pro Gln Cys Ile Arg Tyr Leu Pro Tyr 180 185190 Pro 183 56 PRT Homo sapien 183 Met Trp Cys Arg Cys Val Cys Leu AsnTyr Cys Gln Cys Val Pro Pro 1 5 10 15 Ser Trp Thr Phe Leu Pro Ser LeuMet His Val Gln Tyr Asp Ser His 20 25 30 Glu Asn Asp Glu Pro Cys His GluVal Leu Ile Ala Asn Glu Glu Arg 35 40 45 Leu His Arg Lys Asn Met Lys Lys50 55 184 105 PRT Homo sapien 184 Met Pro Tyr Gly Val Thr Gln Phe LysLeu Thr Arg Ile Val Ser Ala 1 5 10 15 Ile Gly Trp Glu Leu Thr Thr CysAsp Pro Ser Tyr Tyr Thr Pro Val 20 25 30 Leu Thr Leu Ser Leu Leu Lys PheCys Ala Leu Glu His Ile His Lys 35 40 45 Asn Asn Arg Ala Arg Ala Leu GlnGly Asn His Thr Pro Pro Asn Ser 50 55 60 Lys Leu Arg Asn Thr His Ile SerArg Glu Ala Gln Arg Gly Tyr Lys 65 70 75 80 Glu Tyr Cys Ala Arg Gln ArgAsn Pro Gln Thr Pro His Pro Arg Ala 85 90 95 Gln Pro Gly Thr Gln Asn SerLys Asn 100 105 185 38 PRT Homo sapien 185 Met Ile Val Arg Gly Glu ValHis Thr Leu Met His Leu Glu Leu Tyr 1 5 10 15 Cys Ile Ile Arg Thr ThrSer Asp Thr Ser Phe Phe Phe Phe Phe Phe 20 25 30 Phe Phe Pro Tyr Cys Asn35 186 77 PRT Homo sapien 186 Met Val Thr Gly Cys Leu Leu Arg Gln CysAla Asp Arg Cys Gln Val 1 5 10 15 Asn Ser Thr Ala His Phe Trp Leu AsnPhe Leu Gln Leu Ser Ser Val 20 25 30 Arg Ser Lys Val His Leu Gln Pro SerLeu Arg Ala Leu Leu Phe Ser 35 40 45 Ser Ser Val Arg Thr Cys Thr Gly GlnPro Cys Pro Phe Gln Phe Ser 50 55 60 Ala Ser Trp Leu Gly Ala His Arg LeuLeu Ser Asn His 65 70 75 187 13 PRT Homo sapien 187 Met Leu Phe Pro CysVal Lys Leu Val Tyr Ser Ala His 1 5 10 188 44 PRT Homo sapien 188 MetArg Arg Pro Ala Arg Leu Val Glu Arg Ala Val Cys Leu Val Leu 1 5 10 15Glu Phe Leu Phe Phe Ile Ser Phe Leu Ser Cys Asn Ser Tyr Phe Trp 20 25 30Phe Ala Trp Thr Val Leu His Thr Pro Ile Phe Leu 35 40 189 53 PRT Homosapien 189 Met Leu Leu Ser Lys Gly Thr Gly Thr Thr Leu Ile Phe Ile AspGly 1 5 10 15 
Met Leu Lys Arg Trp Ala Tyr Ile Tyr Val Pro Tyr Ala CysSer Pro 20 25 30 Gly Cys Gly Gln Trp Cys Ile Pro Ala Pro His Ser Pro HisAsn Leu 35 40 45 Pro Glu Gln His Asp 50 190 84 PRT Homo sapien 190 MetThr Cys Phe Val Asp Asp Cys Cys Gly Asp Leu Gly Thr Glu Lys 1 5 10 15Asn Leu Pro Lys Lys Asn Lys Lys Ala Asn Leu Gly Gly Ile Lys Lys 20 25 30Glu Asn Phe Phe Val Lys Lys Lys Lys Arg Lys Lys Lys Asn Glu Lys 35 40 45Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys 50 55 60Thr Ser Pro Arg His Asp His Thr Leu Arg Ala Arg Met Ile Lys Thr 65 70 7580 Ile Ala Ile Tyr 191 60 PRT Homo sapien 191 Met Gly Arg Leu Val LysPhe Lys His Gly Asn Asn Ser Glu Ile Asn 1 5 10 15 Ser Phe Arg Gly AsnHis Pro Phe Pro Thr Glu Pro Thr Pro Phe Lys 20 25 30 Leu Asn Ser Ser LeuArg Leu Leu Gly Phe Ser Leu Ala Val Lys Ser 35 40 45 Ser Gly Phe Leu LysAsn Asp Gly Leu Pro Trp Lys 50 55 60 192 269 PRT Homo sapien 192 Met AlaAla Ser Gly Ser Gly Met Ser Gln Lys Thr Trp Glu Leu Ala 1 5 10 15 AsnAsn Met Gln Glu Ala Gln Ser Ile Asp Glu Ile Tyr Lys Tyr Asp 20 25 30 LysLys Gln Gln Gln Glu Ile Leu Ala Ala Lys Pro Trp Thr Lys Asp 35 40 45 HisHis Tyr Phe Lys Tyr Cys Lys Ile Ser Ala Leu Ala Leu Leu Lys 50 55 60 MetVal Met His Ala Arg Ser Gly Gly Asn Leu Glu Val Met Gly Leu 65 70 75 80Met Leu Gly Lys Val Asp Gly Glu Thr Met Ile Ile Met Asp Ser Phe 85 90 95Ala Cys Leu Trp Gln Gly Thr Glu Thr Arg Val Asn Ala Gln Ala Ala 100 105110 Ala Tyr Glu Tyr Met Ala Ala Tyr Ile Glu Asn Ala Lys Gln Val Gly 115120 125 Arg Leu Glu Asn Ala Ile Gly Trp Tyr His Ser His Pro Gly Tyr Gly130 135 140 Cys Trp Leu Ser Gly Ile Asp Val Ser Thr Gln Met Leu Asn GlnGln 145 150 155 160 Phe Gln Glu Pro Phe Val Ala Val Val Ile Asp Pro ThrArg Thr Ile 165 170 175 Ser Ala Gly Lys Val Asn Leu Gly Ala Phe Arg ThrTyr Pro Lys Gly 180 185 190 Tyr Lys Pro Pro Asp Glu Gly Pro Ser Glu TyrGln Thr Ile Pro Leu 195 200 205 Asn Lys Ile Glu Asp Phe Gly Val His CysLys Gln Tyr Tyr Ala Leu 210 215 220 Glu Val Ser Tyr Phe Lys Ser Ser LeuAsp Arg Lys Leu Leu Glu 
Leu 225 230 235 240 Leu Trp Asn Lys Tyr Trp ValAsn Thr Leu Ser Ser Ser Ser Leu Leu 245 250 255 Thr Asn Ala Asp Tyr ThrThr Gly Gln Val Phe Asp Leu 260 265 193 146 PRT Homo sapien 193 Met TrpCys Ser Tyr Pro Tyr Cys Cys Ser Gly Phe Leu Leu Ser Tyr 1 5 10 15 ThrVal Cys Thr His Gly Val Asn Ile Gly Cys Val Cys Cys Leu Ser 20 25 30 ArgTrp Trp Leu Ser Leu Val Met Val Pro Val Pro Cys Val Val Val 35 40 45 PheThr Ala Cys Trp Val Cys Val Trp Ser Ser Glu Pro His Leu Met 50 55 60 AspMet Trp Val Arg Pro Val Val His Phe Leu Ala Met Cys His Val 65 70 75 80Pro Arg Val Cys Ser Leu Phe Pro Leu Leu Val Cys Ala Cys Ser Phe 85 90 95Leu Phe Leu Leu Gly Ile Leu Ala Leu Cys Pro Pro Val Ala Leu Tyr 100 105110 Ser Leu Gly Val Cys Val Ser Pro Pro Val Ile Cys Ser Pro Ala Cys 115120 125 Glu Ile Trp Trp Val Cys Arg Ala Pro Ser Cys Ala Leu Tyr Pro Leu130 135 140 Arg Pro 145 194 141 PRT Homo sapien 194 Met Cys Ala His ThrHis Gly Ala Gly His Thr Ala Leu His Phe Gly 1 5 10 15 Arg His Ala GlnVal Phe Ile Arg Arg Ala Arg Gly Leu Ser Ser Ser 20 25 30 Arg Ile Thr HisSer Glu Ser Tyr Cys Leu Leu Pro Ser Leu His Thr 35 40 45 Gln Gly Thr ProArg Ser Arg Gly Arg Pro Thr Arg Gly Val Ser Leu 50 55 60 Ser Ser Arg AlaLeu Val Leu Arg Arg Glu Val Leu Gly Asp Thr His 65 70 75 80 Thr His ThrPro Glu Ser Gly Asp Thr Arg Tyr Arg Asp Cys Leu His 85 90 95 Thr Lys IlePhe Tyr Asn Ile Glu Ile Cys Gly Ser Arg Thr Gln His 100 105 110 Ile TrpAla Pro Ala His Thr Glu Thr Leu Ser Ser Leu Ser His Arg 115 120 125 AlaVal Ala Pro Leu Leu His Arg Glu Ser Gly Glu Pro 130 135 140 195 95 PRTHomo sapien 195 Met Ser Ser His Leu Thr Asn Ser Cys Val Phe Pro Lys TyrSer Ser 1 5 10 15 Leu Phe Thr Gln Gly Leu Val Val Lys Ile Tyr Gln HisPro Gly Ile 20 25 30 Lys Phe Ser Leu Trp Glu Ser Leu Phe His Lys Lys TrpAla Pro Gly 35 40 45 Phe Leu Thr Pro Leu Val Trp Lys Met Leu Trp Gly GluMet Glu Lys 50 55 60 Ser His Phe Leu Leu Tyr Leu Asn Ala Gly Gly Glu ThrSer Trp Ala 65 70 75 80 Asn Ser Arg Val Pro Val Val Gly Lys Trp Leu SerPro Pro Gln 85 90 95 196 54 PRT 
Homo sapien 196 Met Arg Thr Val Val IlePro Glu Gly Trp Gly Gly Asp Arg Leu Gly 1 5 10 15 Glu Gly Phe Arg LysLeu Ser Glu Asp Asp Cys Asn Gly Leu Asn Phe 20 25 30 Gly Lys Val Trp LeuHis Arg Cys Ile Cys Leu Gln Glu Leu Ser Lys 35 40 45 Phe Ile Leu Lys IleCys 50 197 240 PRT Homo sapien 197 Met Pro Pro Leu Leu Phe Glu Val SerSer Leu Glu Asn Ala Phe Gln 1 5 10 15 Ile Gly Gly His Pro Trp His TyrIle Val Thr Pro Asn Lys Lys Lys 20 25 30 Gln Lys Gly Val Phe His Ile CysAla Leu Lys Asp Asn Ser Leu Ala 35 40 45 Lys Asn Gly Ile Gln Glu Met AspCys Cys Ser Leu Glu Ser Asp Trp 50 55 60 Ile Tyr Phe His Pro Asp Ala SerGly Arg Ile Ile His Val Gly Pro 65 70 75 80 Asn Gln Val Lys Val Leu LysLeu Thr Glu Ile Glu Asn Asn Ser Ser 85 90 95 Gln His Gln Ile Ser Glu AspPhe Val Ile Leu Ala Asn Arg Glu Asn 100 105 110 His Lys Asn Glu Asn ValLeu Thr Val Thr Ala Ser Gly Arg Val Val 115 120 125 Lys Lys Ser Phe AsnLeu Leu Asp Asp Asp Pro Glu Gln Glu Thr Phe 130 135 140 Lys Ile Val AspTyr Glu Asp Glu Leu Asp Leu Leu Ser Val Val Ala 145 150 155 160 Val ThrGln Ile Asp Ala Glu Gly Lys Ala His Leu Asp Phe His Cys 165 170 175 AsnGlu Tyr Gly Thr Leu Leu Lys Ser Ile Pro Leu Val Glu Ser Trp 180 185 190Asp Val Thr Tyr Ser His Glu Val Tyr Phe Asp Arg Asp Leu Val Leu 195 200205 His Ile Glu Gln Lys Pro Asn Arg Val Phe Ser Cys Tyr Val Tyr Gln 210215 220 Met Ile Cys Asp Thr Gly Glu Glu Glu Glu Thr Ile Asn Arg Ser Cys225 230 235 240 198 31 PRT Homo sapien 198 Met Ile Pro Gln Leu Gly GluSer Val Leu Ile His Cys Pro Asn Gly 1 5 10 15 Pro Pro Leu Pro His ValSer Pro Pro Ser Ser Asn Pro Ser Tyr 20 25 30 199 62 PRT Homo sapien 199Met Pro Ala Pro Leu Gly Gly Arg Gly Gly Trp Ser Pro Pro Arg Ser 1 5 1015 Arg Ser Ser Arg Gln Arg Leu Ala Asp Met Ala Lys Pro Arg Leu Tyr 20 2530 Tyr Lys Lys Asn Thr Lys Arg Leu Asp Trp Val Trp Trp Cys Val Pro 35 4045 Ile Ile Pro Ala Thr Gln Glu Ala Glu Ala Gly Glu Phe Phe 50 55 60 200245 PRT Homo sapien 200 Met Gly Arg Ser Cys Val Val Cys Phe Val Cys LeuPhe Phe Ser Phe 1 5 10 15 Val Phe Arg Leu 
Ser Ser Arg Ala Val Ala AlaLeu Arg Phe Ser Val 20 25 30 Cys Val Val Arg Arg Val Arg Leu Ala Ala SerSer Phe Val Leu Arg 35 40 45 Arg Ser Ala Leu Ser Leu Ser Ser Val Ser SerLeu Val Ser Pro Ala 50 55 60 Leu Leu Pro Leu Arg Ser Leu Ser Ser Ser SerPhe Leu Ser Pro Phe 65 70 75 80 Val Ala Pro Cys Leu Ser Val Cys Phe ValPro Val Leu Val Cys Leu 85 90 95 Ser Ser Ala Phe Ala Ser Leu Ser Arg SerCys Ser Phe Leu Leu Ser 100 105 110 Val Arg Phe Ala Phe Ser Val Ser ArgVal Gly Leu Phe Cys Val Leu 115 120 125 Phe Leu Leu Cys Leu Ala Arg LeuSer Ser Val Phe Ala Ser Cys Ser 130 135 140 Gly Phe Ser Leu Leu Phe PhePhe Leu Leu Phe Phe Phe Phe Cys Phe 145 150 155 160 Leu Ser Leu Cys LeuSer Phe Phe Phe Ser Phe Leu Phe Phe Pro Ser 165 170 175 Trp Cys Leu PheSer Phe Leu Phe Phe Ala Phe Ser Ser Ile Cys Phe 180 185 190 Cys Leu LeuTrp Asp Asn Phe Leu Phe Val Phe Leu Ala Ile Phe Ser 195 200 205 Ser ValPhe Ser Ser Leu His Cys Val Phe Leu Phe Ser Ser Phe Val 210 215 220 ProPro Leu Tyr Phe Val Ile Phe Ser Phe Ala Leu Trp Tyr Ser Cys 225 230 235240 Trp Arg Pro Gly Val 245 201 144 PRT Homo sapien 201 Glu Gln Met SerCys Gln Trp Glu Phe Lys Cys Gln His Gly Glu Glu 1 5 10 15 Glu Cys LysPhe Asn Lys Val Glu Ala Cys Val Leu Asp Glu Leu Asp 20 25 30 Met Glu LeuAla Phe Leu Thr Ile Val Cys Met Glu Glu Phe Glu Asp 35 40 45 Met Glu ArgSer Leu Pro Leu Cys Leu Gln Leu Tyr Ala Pro Gly Leu 50 55 60 Ser Pro AspThr Ile Met Glu Cys Ala Met Gly Asp Arg Gly Met Gln 65 70 75 80 Leu MetHis Ala Asn Ala Gln Arg Thr Asp Ala Leu Gln Pro Pro His 85 90 95 Glu TyrVal Pro Trp Val Thr Val Asn Gly Lys Pro Leu Glu Asp Gln 100 105 110 ThrGln Leu Leu Thr Leu Val Cys Gln Leu Tyr Gln Gly Lys Lys Pro 115 120 125Asp Val Cys Pro Ser Ser Thr Ser Ser Leu Arg Ser Val Cys Phe Lys 130 135140 202 76 PRT Homo sapien 202 Met Pro Ser Asp Arg Met His Leu Phe IleLeu Lys Met Ala Ser Leu 1 5 10 15 Arg His Pro Thr Gly Gln Pro Cys LysLeu Lys Ser Gln Gly Ala His 20 25 30 Cys Thr Gln Leu Ser His Ala Leu ThrThr Ala Ser Leu Gln Leu Leu 35 40 45 Thr Leu Gly 
Tyr Asn Ser Ser Asn IleAsn Gly Phe Ser Leu Gln His 50 55 60 Cys Thr Leu Gln Asn Ile Glu Gln GlyPhe Ser Leu 65 70 75 203 60 PRT Homo sapien 203 Asp Ala Lys Glu Asp HisGlu Arg Thr His Gln Met Val Leu Leu Arg 1 5 10 15 Lys Leu Cys Leu ProMet Leu Cys Phe Leu Leu His Thr Ile Leu His 20 25 30 Ser Thr Gly Gln TyrGln Glu Cys Leu Gln Leu Ala Asp Met Val Ser 35 40 45 Ser Glu Gly His LysLeu Tyr Leu Val Ser Ser Arg 50 55 60 204 96 PRT Homo sapien 204 Met CysLeu Val Ser Phe Val Val Phe Ile Phe Leu Ser Asn Thr Pro 1 5 10 15 GlyPro Phe Phe Ser Phe Ser Leu Gly Leu Phe Ser Phe Ala Phe Leu 20 25 30 PheLeu Gln Leu Phe Phe Phe Leu Val Leu Phe Ser Phe Leu Ile Phe 35 40 45 LeuLeu Val Phe Ser Val Phe Ser Leu Leu Asp Phe Tyr Phe Tyr Met 50 55 60 PheVal Phe Ser Phe Phe Ser Leu Leu Ser Leu Phe Ser Phe Leu Leu 65 70 75 80Phe Phe Tyr Val Val Val Leu Ser Trp Ile Leu Asp Trp Ile Phe Arg 85 90 95205 34 PRT Homo sapien 205 Met Met Asp Asp Thr Leu Pro Gly Thr Leu ValHis Tyr Ser Gln Cys 1 5 10 15 Ser Ser Ser Ala Tyr Asn Ser Cys Leu ProVal Asp Ser Thr Asn Glu 20 25 30 Ser Gly 206 42 PRT Homo sapien 206 MetPro Val Val Pro Ala Ile Trp Glu Ala Lys Glu Asp Arg Leu Ser 1 5 10 15Ser Gly Asp Arg Gly Cys Ser Trp Ala Glu Ile Ala Pro Gln Pro Ser 20 25 30Ser Leu Val Lys Arg Glu Arg Phe His Leu 35 40 207 111 PRT Homo sapien207 Leu Phe Val Tyr Ala Arg Trp Asn Leu Ser Leu Leu Thr Arg Leu Glu 1 510 15 Gly Cys Gly Ala Ile Ser Ala Gln Cys Asn Leu Tyr Leu Leu Ser Ser 2025 30 Ser Asp Pro Ser Leu Ala Ser Gln Ile Ala Gly Thr Thr Gly Met Cys 3540 45 His His Val Gln Leu Ile Leu Tyr Phe Ala Ala Arg Arg Phe Tyr His 5055 60 Val Gly Gln Gly Gly Leu Glu Leu Leu Ala Ala Ser Gly Pro Pro Ser 6570 75 80 Ser Ala Tyr Gln Ser Ala Val Ile Thr Gly Val Ser His His Ala Gln85 90 95 Pro Leu Asn Ser Val Phe Tyr Ser Lys Ala Lys Ala His Val Phe 100105 110 208 81 PRT Homo sapien 208 Met Leu Ala Leu Phe Val Val Gly GlyCys Pro Cys Ser Phe Gln Tyr 1 5 10 15 Met Arg Gly Gln Gly Asp Pro ArgGly Pro Phe Cys Gly Pro Leu Trp 20 25 30 Lys Lys Gly Arg 
Arg Tyr Val SerCys Leu Ile Thr Ser Ile Lys Pro 35 40 45 Val Ala Cys Ile Ser Leu Lys CysAla Ile Tyr Ala Gly Ser Ser Gly 50 55 60 Gly Val Ile Tyr Val Trp Ala ProPro Arg Ala Pro Asn Thr Pro Leu 65 70 75 80 Tyr 209 67 PRT Homo sapien209 Met Lys Val Pro His Gln Arg Lys Lys Asn Lys Asn Thr Lys Lys Arg 1 510 15 Lys Lys Lys Lys Lys Val Leu Trp Gly Gly Tyr Thr Thr Cys Gly His 2025 30 Asn Ile Gly Val Leu Pro Gly Val Cys Cys Ala Arg Thr Thr Trp Cys 3540 45 Cys Val Ile Ile Thr Gly Gly Phe Ser Asp Lys Phe Phe Arg Asp Lys 5055 60 Lys Asn Leu 65 210 80 PRT Homo sapien 210 Met Phe Met Cys Ile CysTyr Leu Pro Asn Tyr Ile Thr Ser Ser Leu 1 5 10 15 Lys Val Glu Met SerMet Glu Thr Asp Asn Met Ser Gly Leu Leu Leu 20 25 30 His Thr Leu Gln ValSer Ala His Leu Ile Phe Ile Ala Thr Leu Arg 35 40 45 Asn Ser His Cys TyrPro His Phe Ile Ser Arg Gln Gly Lys Val Lys 50 55 60 Ser Gly Lys Val TyrLeu Trp His Lys Leu Leu Asn Glu Gly Thr Tyr 65 70 75 80 211 125 PRT Homosapien 211 Met Ser Ser Glu Val Ser Val Trp Glu Phe Val Gly Ala Gly GlyLeu 1 5 10 15 His Gln Ser Val Ser Lys Gln Pro Arg Gly Lys Ala Lys ProLeu Val 20 25 30 Gly Asn Pro Tyr Trp Ser Phe Asn Arg Leu Ser Lys Gly LeuPhe Trp 35 40 45 Lys Trp Glu Lys Ala Cys Cys Leu Pro Thr Gly Gly Glu ThrThr Val 50 55 60 Phe Gly Gly Leu Phe Pro Lys Leu Val Ser Lys Gly Asn CysTrp Phe 65 70 75 80 Pro Val Phe Gln Lys Gly Asn Gly Phe Ser Val Ser GlyTrp Gly Ser 85 90 95 Asn Pro Val Leu Val Leu Gly Gly Val Asn Pro Arg ProLys Lys Ile 100 105 110 Lys Leu Glu Thr Ser Pro Tyr Thr Ala Lys Ser TrpGly 115 120 125 212 167 PRT Homo sapien 212 Met Arg Thr Trp Trp Cys ArgVal Leu Glu Val Arg His Val Ala Lys 1 5 10 15 Gly Gly Ala Pro Leu ArgLeu Arg Phe Leu Trp Arg Ser Val Ser Pro 20 25 30 Ala Cys Arg Glu Lys GluIle Ser Leu Ala Gln Thr His Asn Thr Arg 35 40 45 Met Arg Thr His Asn LeuLys Asp Tyr Lys Arg Lys Ser Leu Arg Arg 50 55 60 Asn Asn Leu Leu Arg AlaAla Ala His Ser His Val Leu Trp Arg Val 65 70 75 80 Ser Pro Thr Tyr SerHis His His Thr Met Cys Ala Val Thr Arg Cys 85 90 95 Thr 
Pro Arg Gly ValLeu Pro Ser Arg Gly Ser Ser Arg Val Cys Val 100 105 110 Lys Arg Ala ThrHis Arg Phe Arg Cys Ile Leu Tyr Ser Glu Asp Leu 115 120 125 Trp Val PheIle His Ser Val Val Ser Ile Pro Phe Val Pro Val Gly 130 135 140 Val LysIle Trp Leu Pro Ala Leu Thr Ile Leu Pro Thr Thr Cys Gly 145 150 155 160Thr Lys Asp Thr Pro Leu Phe 165 213 151 PRT Homo sapien 213 Met His AlaArg Ala Ala Gln Cys Asp Gly Phe Ala Ala Arg Ser Pro 1 5 10 15 Pro PhePhe Phe Phe Phe Phe Phe Phe Phe Leu Gly Arg Gly Lys Asn 20 25 30 Phe PhePhe Phe Phe Ile Phe Ser Gln Lys Pro Phe Phe Trp Lys Lys 35 40 45 Leu LysVal Ala Met Arg Gly Phe Leu Tyr Lys Lys Asn Ile Lys Thr 50 55 60 Arg GlyIle Leu Leu Phe Gln Lys Lys Phe Asn Leu Leu Phe Val Asp 65 70 75 80 LysAla His His Glu Trp Val Tyr Lys Leu Val Leu Ser Tyr Ile Phe 85 90 95 GlnArg Lys Tyr Tyr Ser His Ser Val His Val Tyr Ser Ile Thr Val 100 105 110Cys Ser Arg Arg Lys Ser Arg Arg Ala Cys Asn Ser Leu Gly Val His 115 120125 Lys Cys Val Leu Pro Leu Cys Glu Ile Leu Cys Phe Ile Pro Val Pro 130135 140 Gln Tyr Ser His Asn Asn Ile 145 150 214 118 PRT Homo sapien 214Met Leu Cys Arg Ser Val Cys Asp Tyr Pro Pro Ala Arg Val Arg Arg 1 5 1015 Glu Val Val Val Cys Asn Thr Lys Arg Gly Gly Gly Arg Arg Arg Glu 20 2530 Gln Pro Ser Ile Thr Arg Val Ala Ala Leu Ile Tyr Ile Tyr Met Val 35 4045 Glu Gly Glu Ile Lys His Ile Ser Arg Glu Arg Glu Gly Glu Arg Ala 50 5560 Asn Pro Thr Thr Ala Gly Gln Gln Glu Ala Ile Ser Arg Gly Glu Glu 65 7075 80 Glu Arg Gly Cys Ser Ala Arg Arg Ala Pro Thr Pro Pro His Asn Thr 8590 95 Leu Tyr Arg Thr Gln Gln Thr Lys Pro Gln Pro Arg Thr Gln Ser Thr100 105 110 Arg Glu Tyr Lys Lys Ile 115 215 72 PRT Homo sapien 215 MetVal Ala Met Ile Ile Arg Ser Ile Phe Val Gly Leu Leu Ala His 1 5 10 15Ser Cys Cys His Ala Gly Asp Asp Thr Phe Arg Ala Pro Leu Ala Leu 20 25 30Ile Leu Glu Leu Leu His Leu Ile Val Val Gly Phe Trp Asp Ser Val 35 40 45Ser Val His Ile Asp Thr Pro Pro Glu Glu Leu Leu Met Ile Phe Phe 50 55 60Leu Gln Gln Cys Ser Tyr Val Val 65 70 216 58 PRT Homo sapien 
216 Met CysHis Cys Pro Arg Val Pro Pro Ile Pro Gln Ala Thr Asn Phe 1 5 10 15 ValThr Arg Glu Gln Ile Gln Glu Ile Ser Ser Gln Ala Lys Val Gln 20 25 30 SerAla Ala Asn His Gly Arg His Ala Glu Pro Arg Arg Arg Cys Ala 35 40 45 SerLeu Val Pro Gly Ser Asp Gly Ala Ala 50 55 217 121 PRT Homo sapien 217Met Gly Gln Asn Gly Val Ser Pro Gly Gly Lys Cys Gly Cys Thr Gly 1 5 1015 Leu Lys Ile Pro Thr Lys Gln Phe Glu Thr Thr Lys Asn Glu Gln Gln 20 2530 Gln Glu Lys Glu Glu Gln Thr Arg His Thr Arg Asn Arg Arg Arg Arg 35 4045 Glu Arg Glu Arg Asn Thr Asn Thr Gln Gln Pro Arg Lys Asp Glu Lys 50 5560 Glu Arg Glu Lys Arg Glu Arg Lys Glu Glu Lys Arg Glu Asn Lys Lys 65 7075 80 Lys Glu His Gln Lys Glu Lys Lys Asn Thr Lys Thr Arg Gln His Thr 8590 95 Lys Gln Arg Lys Thr Gly Arg Thr Thr Lys Glu Asp Lys Asn Ser Asn100 105 110 Glu Lys Gln Glu Arg Thr Lys Thr Lys 115 120 218 67 PRT Homosapien 218 Gly Pro Gln Gly Pro Pro Gly Tyr Gly Lys Met Gly Ala Thr GlyPro 1 5 10 15 Met Gly Gln Gln Gly Ile Pro Gly Ile Pro Gly Pro Pro GlyPro Met 20 25 30 Gly Gln Pro Gly Lys Ala Gly His Cys Asn Pro Ser Asp CysPhe Gly 35 40 45 Ala Met Pro Met Glu Gln Gln Tyr Pro Pro Met Lys Thr MetLys Gly 50 55 60 Pro Phe Gly 65

We claim:
 1. An isolated nucleic acid molecule comprising (a) a nucleic acid molecule comprising a nucleic acid sequence that encodes an amino acid sequence of SEQ ID NO: 116 through 218; (b) a nucleic acid molecule comprising a nucleic acid sequence of SEQ ID NO: 1 through 115; (c) a nucleic acid molecule that selectively hybridizes to the nucleic acid molecule of (a) or (b); or (d) a nucleic acid molecule having at least 60% sequence identity to the nucleic acid molecule of (a) or (b).
 2. The nucleic acid molecule according to claim 1, wherein the nucleic acid molecule is a cDNA.
 3. The nucleic acid molecule according to claim 1, wherein the nucleic acid molecule is genomic DNA.
 4. The nucleic acid molecule according to claim 1, wherein the nucleic acid molecule is a mammalian nucleic acid molecule.
 5. The nucleic acid molecule according to claim 4, wherein the nucleic acid molecule is a human nucleic acid molecule.
 6. A method for determining the presence of a breast specific nucleic acid (BSNA) in a sample, comprising the steps of: (a) contacting the sample with the nucleic acid molecule according to claim 1 under conditions in which the nucleic acid molecule will selectively hybridize to a breast specific nucleic acid; and (b) detecting hybridization of the nucleic acid molecule to a BSNA in the sample, wherein the detection of the hybridization indicates the presence of a BSNA in the sample.
 7. A vector comprising the nucleic acid molecule of claim 1.
 8. A host cell comprising the vector according to claim 7.
 9. A method for producing a polypeptide encoded by the nucleic acid molecule according to claim 1, comprising the steps of (a) providing a host cell comprising the nucleic acid molecule operably linked to one or more expression control sequences, and (b) incubating the host cell under conditions in which the polypeptide is produced.
 10. A polypeptide encoded by the nucleic acid molecule according to claim 1.
 11. An isolated polypeptide selected from the group consisting of: (a) a polypeptide comprising an amino acid sequence with at least 60% sequence identity to SEQ ID NO: 116 through 218; or (b) a polypeptide comprising an amino acid sequence encoded by a nucleic acid molecule comprising a nucleic acid sequence of SEQ ID NO: 1 through 115.
 12. An antibody or fragment thereof that specifically binds to the polypeptide according to claim 11.
 13. A method for determining the presence of a breast specific protein in a sample, comprising the steps of: (a) contacting the sample with the antibody according to claim 12 under conditions in which the antibody will selectively bind to the breast specific protein; and (b) detecting binding of the antibody to a breast specific protein in the sample, wherein the detection of binding indicates the presence of a breast specific protein in the sample.
 14. A method for diagnosing and monitoring the presence and metastases of breast cancer in a patient, comprising the steps of: (a) determining an amount of the nucleic acid molecule of claim 1 or a polypeptide of claim 6 in a sample of a patient; and (b) comparing the amount of the determined nucleic acid molecule or the polypeptide in the sample of the patient to the amount of the breast specific marker in a normal control; wherein a difference in the amount of the nucleic acid molecule or the polypeptide in the sample compared to the amount of the nucleic acid molecule or the polypeptide in the normal control is associated with the presence of breast cancer.
 15. A kit for detecting a risk of cancer or presence of cancer in a patient, said kit comprising a means for determining the presence of the nucleic acid molecule of claim 1 or a polypeptide of claim 6 in a sample of a patient.
 16. A method of treating a patient with breast cancer, comprising the step of administering a composition according to claim 12 to a patient in need thereof, wherein said administration induces an immune response against the breast cancer cell expressing the nucleic acid molecule or polypeptide.
 17. A vaccine comprising the polypeptide or the nucleic acid encoding the polypeptide of claim 11.
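
Claims 1(d) and 11(a) recite a threshold of at least 60% sequence identity. As a rough illustration only, the following Python sketch computes percent identity over two sequences that have already been aligned to equal length. The function names, the gap character "-", and the choice of denominator (all aligned columns except those where both sequences have a gap) are assumptions made for this example; they are not taken from the claims, and the alignment program and parameters that govern the claimed identity figure are not restated in this section.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (illustrative only): identity is the number of columns with
    identical residues divided by the number of aligned columns, ignoring
    columns in which both sequences carry a gap ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Keep every column except those where both sequences have a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")

    # A residue paired with a gap never counts as a match.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True if the percent identity is at or above the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # SEQ ID NO: 187 from the listing above, written in one-letter code.
    seq_187 = "MLFPCVKLVYSAH"
    # Hypothetical variant with one substitution, invented for this example.
    variant = "MLFPAVKLVYSAH"
    print(round(percent_identity(seq_187, variant), 1))
    print(meets_threshold(seq_187, variant))
```

A different convention for the denominator (for example, the length of the shorter sequence or of the reference sequence) would give different percentages for the same alignment, so the convention should be fixed before a sequence is compared against the 60% figure.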